Pandas functions for exporting merged data frames to various formats became important to me because, after a merge, the new frame contains repeated identical values. I therefore need to remove the enumeration (index) column that gets written out with the rows and delete the identical repeated lines.
Either there is a bug on my PC or the behavior of the Pandas function has changed, because index_col=None does not seem to work.
I tested it while creating data frames, and the CSV still gets the row enumeration as its first column.
From the pandas.read_table documentation, on the index_col parameter:
Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to _not_ use the first column as the index (row names).
So I used pd.read_table to import the text files, specifying "," as the delimiter, and the data frames merge function to combine them (I still need to test which merging function is best). Pandas brings automation possibilities for constantly updating group membership data and rules.
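A minimal sketch of that import-and-merge step is below. The column names (member_id, name, group) and the sample rows are made up for illustration, and in-memory text stands in for the real text files. A plain inner merge is assumed here only as a starting point, since the best merging function is still to be tested.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real comma-delimited text files.
members_txt = io.StringIO("member_id,name\n1,Ann\n2,Bob\n3,Cid\n")
groups_txt = io.StringIO("member_id,group\n1,Python\n2,Pandas\n2,Pandas\n")

# read_table defaults to tab-separated input, so the comma must be given explicitly.
# index_col=False tells pandas not to use the first column as the row index.
members = pd.read_table(members_txt, sep=",", index_col=False)
groups = pd.read_table(groups_txt, sep=",", index_col=False)

# Inner merge on the shared key; the repeated row in groups carries over
# into the result, which is where the identical lines come from.
result = members.merge(groups, on="member_id", how="inner")
print(result)
```

Note that the merged frame still contains the repeated row, so the merge alone does not solve the duplicate problem.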
On Stack Overflow I found the right code to export a CSV without the first enumeration column: result.to_csv('3.csv', index=False).
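To see where the extra first column comes from, the short sketch below compares the default export with index=False. The frame contents are invented for the example. When to_csv is called without a path it returns the CSV text as a string, which makes the difference easy to inspect.

```python
import io
import pandas as pd

# Hypothetical frame standing in for the merged result.
result = pd.DataFrame({"name": ["Ann", "Bob"], "group": ["Python", "Pandas"]})

# Default: the row index (0, 1, ...) is written as an unnamed first column.
with_index = result.to_csv()

# index=False: only the data columns are written.
without_index = result.to_csv(index=False)

print(with_index.splitlines()[0])     # header line starts with an empty field
print(without_index.splitlines()[0])  # header line has only the data columns

# Reading the default export back shows the enumeration as a real column.
reloaded = pd.read_csv(io.StringIO(with_index))
print(list(reloaded.columns))
```

So the enumeration column is added at export time by to_csv, not at import time, which would explain why changing index_col when reading did not help.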
There is also an interesting function that can read HTML code into data frames. See the Pandas API Reference:
read_html(io[, match, flavor, header, ...]): Read HTML tables into a list of DataFrame objects.
Here is the right code to export to a text file: result.to_csv(r'pandas.txt', index=None, mode='a'). Mode 'a' appends to the text file instead of overwriting it. So now I can remove the repeated identical lines. Pandas also has capabilities for matching similar lines of text in code.
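One thing to watch with mode='a' is that every call writes the header line again unless told otherwise. The sketch below is one possible way to handle that, assuming two made-up batches and a temporary directory in place of the real pandas.txt location.

```python
import os
import tempfile
import pandas as pd

# Temporary location used only for this example; the post writes to pandas.txt.
path = os.path.join(tempfile.mkdtemp(), "pandas.txt")

# Two hypothetical batches exported one after another.
batch1 = pd.DataFrame({"name": ["Ann", "Bob"], "group": ["Python", "Pandas"]})
batch2 = pd.DataFrame({"name": ["Bob", "Cid"], "group": ["Pandas", "Python"]})

for batch in (batch1, batch2):
    # Write the header only when the file does not exist yet,
    # so appended batches do not repeat the column names.
    batch.to_csv(path, index=False, mode="a", header=not os.path.exists(path))

combined = pd.read_csv(path)
print(combined)
```

The appended file keeps every row from every batch, so rows that occur in more than one batch still appear more than once.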
Pandas documentation for the to_csv function:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.to_csv.html
And the final code to export a data frame free from repeated identical lines is df2.to_csv('facebook_groups_members.csv', index=None). Keep in mind that I did not spend much time reading the Pandas documentation. It is worth doing, since you can revolutionize marketing or economic research: you can download HTML into a data frame, or scrape a site's tables with a Scrapy spider and process them with Python into a data frame. Then you code the statistical analysis.
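The post does not show how df2 was built, so the sketch below assumes it came from drop_duplicates, which is the usual pandas call for removing identical rows. The data is invented, and the CSV is produced as a string rather than written to facebook_groups_members.csv.

```python
import pandas as pd

# Hypothetical merged frame with one repeated identical row.
df = pd.DataFrame({
    "member": ["Ann", "Bob", "Bob", "Cid"],
    "group": ["Python", "Pandas", "Pandas", "Python"],
})

# Assumption: df2 is the de-duplicated frame. drop_duplicates keeps the first
# occurrence of each identical row; reset_index renumbers the remaining rows.
df2 = df.drop_duplicates().reset_index(drop=True)

# Passing a file name instead would write the same text to disk.
csv_text = df2.to_csv(index=False)
print(csv_text)
```

To compare only some columns when deciding what counts as a duplicate, drop_duplicates accepts a subset argument with the column names to check.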