This package allows easy data flow between a worksheet in a Google spreadsheet
and a Pandas DataFrame. Any worksheet you can obtain using the gspread package
can be retrieved as a DataFrame with get_as_dataframe; DataFrame objects can
be written to a worksheet using set_with_dataframe:
import pandas as pd
from gspread_dataframe import get_as_dataframe, set_with_dataframe

# Placeholder: any Worksheet object obtained through a gspread client.
worksheet = some_worksheet_obtained_from_gspread_client

# Write a 100-row DataFrame to the worksheet, then read it back.
df = pd.DataFrame.from_records([{'a': i, 'b': i * 2} for i in range(100)])
set_with_dataframe(worksheet, df)
df2 = get_as_dataframe(worksheet)
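The placeholder some_worksheet_obtained_from_gspread_client stands for any gspread Worksheet. As a minimal sketch, assuming service-account credentials and a gspread release recent enough to provide gspread.service_account, a worksheet could be obtained roughly as follows. The helper name open_worksheet, the credentials path, and the spreadsheet and worksheet titles are all hypothetical and not part of this package.

```python
def open_worksheet(credentials_path, spreadsheet_title, worksheet_title):
    """Return a gspread Worksheet (hypothetical helper, not part of gspread-dataframe)."""
    # Imported inside the function so the sketch can be defined
    # even where gspread is not installed.
    import gspread

    # Authenticate with a service-account JSON key file (assumed to exist).
    client = gspread.service_account(filename=credentials_path)
    # Open the spreadsheet by title, then pick one worksheet by its title.
    spreadsheet = client.open(spreadsheet_title)
    return spreadsheet.worksheet(worksheet_title)


# Example call (needs real credentials and an existing spreadsheet):
# worksheet = open_worksheet("service_account.json", "My spreadsheet", "Sheet1")
```

The returned object can then be passed as the worksheet argument to get_as_dataframe and set_with_dataframe.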
The get_as_dataframe function supports the keyword arguments that are
supported by your Pandas version's text parsing readers, such as
pandas.read_csv. Consult your Pandas documentation for a full list of options.
Since the 'python' engine in Pandas is used for parsing, only options
supported by that engine are acceptable:
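import pandas as pd
from gspread_dataframe import get_as_dataframe

# Placeholder: any Worksheet object obtained through a gspread client.
worksheet = some_worksheet_obtained_from_gspread_client

# Parser options are passed as keyword arguments, as with pandas.read_csv.
df = get_as_dataframe(worksheet, parse_dates=True, usecols=[0, 2], skiprows=1, header=None)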
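To get a feel for what some of these options do without touching a real worksheet, the sketch below applies usecols, skiprows and header to a small in-memory CSV using pandas.read_csv with the 'python' engine. It is only an illustration of the Pandas options themselves; the sample text is made up, and it assumes that get_as_dataframe treats worksheet rows in a comparable way.

```python
import io

import pandas as pd

# Made-up sample data standing in for worksheet contents.
text = "name,score,city\nalice,1,Paris\nbob,2,Rome\n"

sample = pd.read_csv(
    io.StringIO(text),
    engine="python",   # the engine the package is described as using
    skiprows=1,        # skip the first line (here, the header line)
    header=None,       # treat the remaining lines as data; columns get integer labels
    usecols=[0, 2],    # keep only the first and third columns
)

print(sample)
```

Because header=None is given, the resulting columns are labeled by position rather than by name.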
New in version 4.0.0: the drop_empty_rows and drop_empty_columns parameters,
both True by default, are now accepted by get_as_dataframe. If you created a
Google sheet with the default number of columns and rows (26 columns, 1000
rows) but have meaningful values only in the top left corner of the worksheet,
these parameters cause all empty rows and columns to be discarded
automatically, so they are absent from the returned DataFrame.
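The effect can be approximated in plain Pandas. The sketch below is not the package's implementation; it only assumes that "empty" means every cell in the row or column is missing, and it uses a small made-up DataFrame in place of a mostly blank worksheet.

```python
import numpy as np
import pandas as pd

# Stand-in for a worksheet with values only in the top left corner.
raw = pd.DataFrame(
    [
        [1.0, 2.0, np.nan, np.nan],
        [3.0, 4.0, np.nan, np.nan],
        [np.nan, np.nan, np.nan, np.nan],
    ],
    columns=["a", "b", "c", "d"],
)

# Drop rows that are entirely empty, then columns that are entirely empty.
trimmed = raw.dropna(axis=0, how="all").dropna(axis=1, how="all")

print(trimmed)
```

To keep the empty rows and columns instead, pass drop_empty_rows=False and drop_empty_columns=False to get_as_dataframe.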
If you install the gspread-formatting package, you can additionally format a
Google worksheet to suit the DataFrame data you've just written. See the
package documentation for details, but here's a short example using the
default formatter:
import pandas as pd
from gspread_dataframe import get_as_dataframe, set_with_dataframe
from gspread_formatting.dataframe import format_with_dataframe
worksheet = some_worksheet_obtained_from_gspread_client
df = pd.DataFrame.from_records([{'a': i, 'b': i * 2} for i in range(100)])
set_with_dataframe(worksheet, df)
format_with_dataframe(worksheet, df, include_column_header=True)
Requirements:

- Python 2.7, 3+
- gspread (>=3.0.0; to use older versions of gspread, use gspread-dataframe releases of 2.1.1 or earlier)
- Pandas >= 0.24.0

To install from PyPI:

pip install gspread-dataframe

To install from source:

git clone https://github.com/robin900/gspread-dataframe.git
cd gspread-dataframe
python setup.py install