bigframes.pandas.read_json

bigframes.pandas.read_json(path_or_buf: str | IO[bytes], *, orient: Literal['split', 'records', 'index', 'columns', 'values', 'table'] = 'columns', dtype: Dict | None = None, encoding: str | None = None, lines: bool = False, engine: Literal['ujson', 'pyarrow', 'bigquery'] = 'ujson', write_engine: Literal['default', 'bigquery_inline', 'bigquery_load', 'bigquery_streaming', 'bigquery_write', '_deferred'] = 'default', **kwargs) → DataFrame

Convert a JSON string to DataFrame object.

Note

Using engine="bigquery" does not guarantee that rows keep the same order as in the file. To get a stable order, include a serialized index column in the data, set it as the index, and sort the resulting DataFrame by it.

Note

For non-BigQuery engines, data is inlined into the query SQL if it is small enough (roughly 5 MB or less in memory); larger data is loaded into a BigQuery table instead.

Examples:

>>> import bigframes.pandas as bpd
>>> gcs_path = "gs://bigframes-dev-testing/sample1.json"
>>> df = bpd.read_json(path_or_buf=gcs_path, lines=True, orient="records")
>>> df.head(2)
   id   name
0   1  Alice
1   2    Bob

[2 rows x 2 columns]
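With lines=True and orient="records" (the combination required by engine="bigquery"), the input is newline-delimited JSON: one object per line. A minimal stdlib-only sketch of that format, using illustrative data values matching the sample above:

```python
import json

# Newline-delimited JSON ("records" orient, one object per line).
# The literal string here is illustrative, not read from GCS.
ndjson = '{"id": 1, "name": "Alice"}\n{"id": 2, "name": "Bob"}'

# Each non-empty line is an independent JSON object.
rows = [json.loads(line) for line in ndjson.splitlines() if line]
print(rows)  # [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
```

BigQuery's load API ingests this format line by line, which is why row order in the file is not preserved without an explicit index column.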
Parameters:
  • path_or_buf (a valid JSON str, path object or file-like object) – A local or Google Cloud Storage (gs://) path when engine="bigquery"; otherwise passed through to pandas.read_json.

  • orient (str, optional) –

    Indication of expected JSON string format. Compatible JSON strings can be produced by to_json() with a corresponding orient value. With engine="bigquery", only orient="records" is supported. The set of possible orients is:

    • 'split' : dict like

      {index -> [index], columns -> [columns], data -> [values]}

    • 'records' : list like

      [{column -> value}, ... , {column -> value}]

    • 'index' : dict like {index -> {column -> value}}

    • 'columns' : dict like {column -> {index -> value}}

    • 'values' : just the values array

  • dtype (bool or dict, default None) –

    If True, infer dtypes; if a dict of column to dtype, then use those; if False, don't infer dtypes at all. Applies only to the data.

    For all orient values except 'table', default is True.

  • encoding (str, default is 'utf-8') – The encoding to use to decode bytes.

  • lines (bool, default False) – Read the file as one JSON object per line. With engine="bigquery", only lines=True is supported.

  • engine ({"ujson", "pyarrow", "bigquery"}, default "ujson") – Type of engine to use. If engine="bigquery" is specified, BigQuery's load API is used. Otherwise, the engine is passed through to pandas.read_json.

  • write_engine (str) – How data should be written to BigQuery (if at all). See bigframes.pandas.read_pandas() for a full description of supported values.

  • **kwargs – Keyword arguments passed to pandas.read_json when not using the BigQuery engine.
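To make the orient variants listed above concrete, here is the same two-row table hand-written in the 'split' form and converted to the 'records' form with the stdlib only (the data values are illustrative):

```python
# The same two-row table in 'split' orient (hand-written for illustration).
split_form = {
    "index": [0, 1],
    "columns": ["id", "name"],
    "data": [[1, "Alice"], [2, "Bob"]],
}

# Rebuild the 'records' orient: one {column -> value} dict per row.
records_form = [
    dict(zip(split_form["columns"], row)) for row in split_form["data"]
]
print(records_form)  # [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
```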

Returns:

The DataFrame representing JSON contents.

Return type:

bigframes.pandas.DataFrame

Raises: