bigframes.pandas.DataFrame.scatter

DataFrame.scatter(x: Hashable | None = None, y: Hashable | None = None, s: Hashable | Sequence[Hashable] = None, c: Hashable | Sequence[Hashable] = None, **kwargs)

Create a scatter plot with varying marker point size and color.

This function calls pandas.plot to generate a plot from a random sample of items rather than from the full DataFrame. The sample size is set by sampling_n. The sampling is seeded, so repeated calls with the same arguments plot the same sample. Use the sampling_random_state parameter to change the seed.
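The effect of a fixed seed can be illustrated with a minimal standard-library sketch. This is not how bigframes implements its sampling; the helper sample_rows and the rows list are hypothetical names used only for illustration, and the defaults of 100 and 0 simply mirror the documented defaults of sampling_n and sampling_random_state.

```python
import random


def sample_rows(rows, n=100, seed=0):
    """Return up to n rows chosen at random; the same seed yields the same rows.

    Hypothetical helper for illustration only, not part of bigframes.
    """
    rng = random.Random(seed)  # private generator, global random state is untouched
    return rng.sample(rows, min(n, len(rows)))


# (length, width) pairs, taken from the example data on this page
rows = [(5.1, 3.5), (4.9, 3.0), (7.0, 3.2), (6.4, 3.2), (5.9, 3.0)]

first = sample_rows(rows, n=3, seed=0)
second = sample_rows(rows, n=3, seed=0)

print(first == second)  # True: same seed, same sample
```

Passing a different seed would generally select a different subset, which is the role sampling_random_state plays here.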

Examples:

Let’s see how to draw a scatter plot using coordinates from the values in a DataFrame’s columns.

>>> import bigframes.pandas as bpd
>>> df = bpd.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1],
...                    [6.4, 3.2, 1], [5.9, 3.0, 2]],
...                   columns=['length', 'width', 'species'])
>>> ax1 = df.plot.scatter(x='length',
...                       y='width',
...                       c='DarkBlue')

And now with the color determined by a column as well.

>>> ax2 = df.plot.scatter(x='length',
...                       y='width',
...                       c='species',
...                       colormap='viridis')

Parameters:
  • x (int or str) – The column name or column position to be used as horizontal coordinates for each point.

  • y (int or str) – The column name or column position to be used as vertical coordinates for each point.

  • s (str, scalar or array-like, optional) –

    The size of each point. Possible values are:

    • A string with the name of the column to be used for the marker size.

    • A single scalar so all points have the same size.

  • c (str, int or array-like, optional) –

    The color of each point. Possible values are:

    • A single color string referred to by name, RGB or RGBA code, for instance ‘red’ or ‘#a98d19’.

    • A column name or position whose values will be used to color the marker points according to a colormap.

  • sampling_n (int, default 100) – Number of randomly sampled items to plot.

  • sampling_random_state (int, default 0) – Seed for the random number generator used for sampling.

  • **kwargs – Additional keyword arguments are documented in DataFrame.plot().

Returns:

A single matplotlib.axes.Axes, or, when subplots=True, an ndarray with one matplotlib.axes.Axes per column.

Return type:

matplotlib.axes.Axes or np.ndarray of them