bigframes.pandas.read_json#
- bigframes.pandas.read_json(path_or_buf: str | IO[bytes], *, orient: Literal['split', 'records', 'index', 'columns', 'values', 'table'] = 'columns', dtype: Dict | None = None, encoding: str | None = None, lines: bool = False, engine: Literal['ujson', 'pyarrow', 'bigquery'] = 'ujson', write_engine: Literal['default', 'bigquery_inline', 'bigquery_load', 'bigquery_streaming', 'bigquery_write', '_deferred'] = 'default', **kwargs) → DataFrame
Convert a JSON string to DataFrame object.
Note
Using engine="bigquery" does not guarantee the same row ordering as the file. Instead, set a serialized index column as the index and sort by it in the resulting DataFrame.
Note
For engines other than "bigquery", data is inlined in the query SQL if it is small enough (roughly 5 MB or less in memory); larger data is loaded to a BigQuery table instead.
Examples:
>>> import bigframes.pandas as bpd
>>> gcs_path = "gs://bigframes-dev-testing/sample1.json"
>>> df = bpd.read_json(path_or_buf=gcs_path, lines=True, orient="records")
>>> df.head(2)
   id   name
0   1  Alice
1   2    Bob

[2 rows x 2 columns]
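The example above reads a newline-delimited JSON file (one JSON object per line), which is the format lines=True with orient="records" expects. As a standalone illustration of that format, the following sketch parses hypothetical file contents with the standard library; the literal string stands in for a file such as the one at gcs_path:

```python
import json

# Hypothetical contents of a newline-delimited JSON file: one complete
# JSON object per line, matching lines=True with orient="records".
ndjson = '{"id": 1, "name": "Alice"}\n{"id": 2, "name": "Bob"}\n'

# Parse each non-empty line independently -- this mirrors how a
# line-oriented reader consumes the file, row by row.
records = [json.loads(line) for line in ndjson.splitlines() if line]
print(records)
# → [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
```

Because each row is a self-contained JSON document, such a file can be split and loaded in parallel, which is why the BigQuery load path requires this shape.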
- Parameters:
path_or_buf (a valid JSON str, path object or file-like object) – A local or Google Cloud Storage (gs://) path when engine="bigquery"; otherwise passed to pandas.read_json.
orient (str, optional) –
Indication of the expected JSON string format. If engine="bigquery", orient only supports "records". Compatible JSON strings can be produced by to_json() with a corresponding orient value. The set of possible orients is:
'split': dict like {index -> [index], columns -> [columns], data -> [values]}
'records': list like [{column -> value}, ... , {column -> value}]
'index': dict like {index -> {column -> value}}
'columns': dict like {column -> {index -> value}}
'values': just the values array
dtype (bool or dict, default None) – If True, infer dtypes; if a dict of column to dtype, then use those; if False, then don't infer dtypes at all. Applies only to the data. For all orient values except 'table', the default is True.
encoding (str, default 'utf-8') – The encoding to use to decode py3 bytes.
lines (bool, default False) – Read the file as a JSON object per line. If engine="bigquery" is used, lines only supports True.
engine ({"ujson", "pyarrow", "bigquery"}, default "ujson") – Type of engine to use. If engine="bigquery" is specified, BigQuery's load API is used; otherwise, the engine is passed to pandas.read_json.
write_engine (str) – How data should be written to BigQuery (if at all). See bigframes.pandas.read_pandas() for a full description of supported values.
**kwargs – Keyword arguments passed to pandas.read_json when not using the BigQuery engine.
- Returns:
The DataFrame representing JSON contents.
- Return type:
bigframes.dataframe.DataFrame
- Raises:
bigframes.exceptions.DefaultIndexWarning – Using the default index is discouraged, such as with clustered or partitioned tables without primary keys.
ValueError – lines is only valid when orient is 'records'.
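The ValueError above reflects a structural property of the orient formats, not an arbitrary restriction. A minimal sketch with the standard library (the sample rows are illustrative, not from any real file): a "records" payload is a flat list of objects, so each element can be written on its own line, whereas a "columns" payload is one nested mapping with no per-row unit to split across lines.

```python
import json

rows = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

# orient="records": a flat list of objects -- each element can be emitted
# on its own line, so line-delimited reading (lines=True) is possible.
records_lines = "\n".join(json.dumps(r) for r in rows)

# orient="columns": a single nested mapping of column -> {index -> value};
# there is no per-row unit to split across lines, hence the ValueError
# when lines=True is combined with this orient.
columns_doc = json.dumps({"id": {"0": 1, "1": 2},
                          "name": {"0": "Alice", "1": "Bob"}})

print(records_lines)
print(columns_doc)
```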