While you are writing and testing a program, it may work perfectly. But once you add data from various sources, or use libraries such as pandas that decode and encode text, a program that relies on text files may stop behaving as programmed. One problem I encountered: the pohmeliy.com bot does not read URLs from a text file that has passed through pandas, which encodes everything as UTF-8. I solved it by downloading the list with the pohmeliy.com tolist bot. Another problem: some URL lines get attached to one another, which breaks the program or makes the list of URLs unreadable. Detecting encodings is a universal problem. To understand how text works with encodings, it is useful to read "What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text".
To detect text encodings I use the Python library chardet ("Chardet: The Universal Character Encoding Detector"), which detects a number of encoding standards and some of their variants, and reports a confidence value for each detection. Today I tested it and got:
C:\Users\ANTRAS>chardetect C:\Pohmeliy_FB\FB_post_to_groupsCR\lists\CRGroups.txt C:\Pohmeliy_FB\FB_post_to_groupsSILK\lists\SILKGroups.txt
C:\Pohmeliy_FB\FB_post_to_groupsCR\lists\CRGroups.txt: ascii with confidence 1.0
C:\Pohmeliy_FB\FB_post_to_groupsSILK\lists\SILKGroups.txt: ascii with confidence 1.0
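A rough version of this check can also be done from Python with only the standard library. The sketch below is not how chardet works internally. The helper name `describe_bytes` is my own, and it only distinguishes three cases: pure ASCII (the result reported above), valid UTF-8, and anything else.

```python
def describe_bytes(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8' or 'unknown' (hypothetical helper)."""
    # Pure ASCII means every byte is below 0x80; such a file is also valid UTF-8.
    if data.isascii():
        return "ascii"
    try:
        # Strict decoding raises an error on any byte sequence that is not UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"


def describe_file(path: str) -> str:
    """Read a file in binary mode, so nothing is decoded implicitly, and classify it."""
    with open(path, "rb") as handle:
        return describe_bytes(handle.read())
```

A result of "unknown" only says that the file is neither ASCII nor UTF-8. Finding the actual legacy encoding still needs a detector such as chardet, or knowledge of where the file came from.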
So you have to either take your data from a single source, or decode and re-encode everything to the same encoding and byte order variant. You can read about the byte order mark (BOM) on Wikipedia.org.
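As an illustration of re-encoding everything to one variant, here is a minimal sketch that uses only the standard `codecs` module. The function name `to_plain_utf8` and the `fallback` parameter are my own assumptions. It looks for a byte order mark, decodes accordingly, and writes the text back as UTF-8 without a BOM, with one item per line.

```python
import codecs

# Check the 4-byte UTF-32 marks before the 2-byte UTF-16 marks,
# because BOM_UTF32_LE starts with the same bytes as BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def to_plain_utf8(data: bytes, fallback: str = "utf-8") -> bytes:
    """Return the same text as UTF-8 without a BOM (hypothetical helper)."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # Drop the mark itself and decode the rest with the matching codec.
            text = data[len(bom):].decode(encoding)
            break
    else:
        # No BOM found: assume the fallback encoding (an assumption, not a detection).
        text = data.decode(fallback)
    # splitlines() accepts \r\n, \r and \n endings; blank lines are dropped.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return b""
    return ("\n".join(lines) + "\n").encode("utf-8")
```

For example, a list saved as UTF-16 with Windows line endings comes back as plain UTF-8 with `\n` endings, so every program that reads it afterwards sees the same bytes.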
Running a small business requires completing many tasks on a PC. With Python, you need to learn to integrate the many libraries that can run your daily tasks, or to code your own solutions to outpace competitors. There are billions of lines of Python code, so the limit is how well you integrate knowledge and how efficiently you work. To run a business you can predict demand, use artificial intelligence to bring in leads, automate routine tasks, and integrate other programs. Everyone from salespeople to scientists uses coding. Become curious!
Sunday, November 19, 2017