numpy.histogram()

The numpy.histogram() function computes the histogram of a dataset. Given an input array and a set of bins, it returns the number of data points that fall into each bin, together with the bin edges.

Syntax

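The basic syntax of the numpy.histogram() function is as follows:

numpy.histogram(a, bins=10, range=None, density=None, weights=None)

Here,

  • a represents the input data array for which the histogram is computed. The array is flattened before counting.
  • bins specifies the bins to use. It can be an integer (the number of equal-width bins), a sequence of bin edges, or a string naming a bin-width estimation method.
  • range specifies the lower and upper limits of the bins as a tuple (lower, upper). It defaults to (a.min(), a.max()), and values outside the range are ignored.
  • density, if True, makes the function return the value of the probability density function at each bin instead of raw counts, normalized so that the integral over the range is 1. The older normed argument was deprecated in its favor and has been removed in recent NumPy releases.
  • weights represents an array of weights with the same shape as a. Each value in a contributes its weight to the bin count instead of 1.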
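
The effect of the range, weights, and density parameters is easiest to see on a tiny dataset whose counts can be checked by hand. The following is a minimal sketch; the array values and weights are made up for illustration.

```python
import numpy as np

# Small hand-made dataset so the counts are easy to verify by eye.
a = np.array([0.5, 1.5, 1.5, 2.5, 3.5, 9.0])

# Four equal-width bins over [0, 4]; 9.0 lies outside the range and is ignored.
counts, edges = np.histogram(a, bins=4, range=(0, 4))
print(counts)  # [1 2 1 1]
print(edges)   # [0. 1. 2. 3. 4.]

# weights: each sample adds its weight to the bin instead of adding 1.
w = np.array([1.0, 2.0, 2.0, 1.0, 1.0, 5.0])
weighted, _ = np.histogram(a, bins=4, range=(0, 4), weights=w)
print(weighted)  # [1. 4. 1. 1.]

# density=True: the values are scaled so that sum(dens * bin_width) equals 1.
dens, _ = np.histogram(a, bins=4, range=(0, 4), density=True)
print(np.sum(dens * np.diff(edges)))
```

Note that with density=True the returned values are densities, not probabilities: they sum to 1 only after being multiplied by the bin widths.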

Example

Consider the following example, which demonstrates how to use numpy.histogram() to compute the histogram of a dataset:

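import numpy as np
import matplotlib.pyplot as plt

# Draw 100 samples from a standard normal distribution
data = np.random.normal(size=100)

# Count the samples in 10 equal-width bins spanning [-5, 5]
hist, bins = np.histogram(data, bins=10, range=(-5, 5))

# Plot using the same bin edges (plt.hist counts the data itself)
plt.hist(data, bins=bins)
plt.show()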

In this example, numpy.random.normal() generates a sample of 100 normally distributed random numbers. The numpy.histogram() function then computes the histogram of the data, using 10 equal-width bins between -5 and 5. Finally, the histogram is plotted with matplotlib.pyplot.hist(), which receives the raw data and the computed bin edges and performs its own counting.

Output

The output of the numpy.histogram() function is a tuple containing two arrays:

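  • hist: an array that contains the number of data points that fall into each bin (or the density values, if density=True).
  • bins: an array that contains the bin edges. It has one more element than hist, because n bins are delimited by n + 1 edges. Every bin is half-open, [left, right), except the last one, which also includes its right edge.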
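
The relationship between the two returned arrays can be checked on a small example. This is a minimal sketch with made-up data; the names bin_edges and centers are chosen here for illustration only.

```python
import numpy as np

data = np.array([1, 2, 2, 3, 3, 3, 4])

# With no explicit range, the bins span data.min() to data.max().
hist, bin_edges = np.histogram(data, bins=3)
print(hist)       # [1 2 4]
print(bin_edges)  # [1. 2. 3. 4.]

# n bins are delimited by n + 1 edges.
print(len(bin_edges) == len(hist) + 1)  # True

# The value 4 equals the last edge and is counted in the last bin,
# because the last bin is closed on the right.

# Bin centers are often convenient as x-coordinates when plotting counts.
centers = (bin_edges[:-1] + bin_edges[1:]) / 2
print(centers)  # [1.5 2.5 3.5]
```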

Explanation

The numpy.histogram() function computes the histogram of a given dataset using a set of predefined bins. It calculates the frequency of values that fall within each bin and returns the frequency counts and the bin edges. The output can then be used to plot the histogram of the dataset.

Use

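The numpy.histogram() function is a common first step when analyzing how the values of a dataset are distributed. It only computes the counts and does not draw anything itself; the result is typically passed to a plotting library such as Matplotlib. Inspecting the counts can reveal the shape of the distribution and point to possible outliers or abnormalities, such as isolated bins far from the bulk of the data.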
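
As a rough illustration of spotting a possible outlier from the counts alone, consider the sketch below. The data and the "isolated last bin" check are assumptions made for this example; this is an informal heuristic, not a statistical outlier test.

```python
import numpy as np

# Made-up measurements: nine values close together and one far away.
data = np.array([10, 11, 11, 12, 12, 12, 13, 13, 14, 50])

# Eight bins of width 5 between 10 and 50.
counts, edges = np.histogram(data, bins=8, range=(10, 50))
print(counts)  # [9 0 0 0 0 0 0 1]

# A run of empty bins separates the bulk of the data from the last bin.
empty_bins = np.flatnonzero(counts == 0)
print(empty_bins)  # [1 2 3 4 5 6]

# Values that landed in the isolated last bin are candidates for a closer look.
last_bin_values = data[data >= edges[-2]]
print(last_bin_values)  # [50]
```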

Important Points

  • The bins parameter of numpy.histogram() can take multiple forms. It can be an integer, providing the number of bins; a sequence, providing the bin edges, including the left edge of the first bin and the right edge of the last bin; or a string that specifies the method to be used to calculate the bins.
  • The density=True parameter of plt.hist() can be used to plot the histogram as a probability density rather than just raw frequencies.
  • It is possible to stack multiple histograms on top of one another using the stacked=True parameter of plt.hist().
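
The three forms of the bins parameter mentioned above can be compared side by side. This is a minimal sketch; the seed, the sample size, and the specific edge values are arbitrary choices for illustration.

```python
import numpy as np

# Seeded generator so the sample is reproducible.
rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# Integer: 20 equal-width bins between data.min() and data.max().
counts_int, edges_int = np.histogram(data, bins=20)
print(len(edges_int))  # 21

# Sequence: explicit edges, which may be unevenly spaced.
# Values outside [-4, 4] would not be counted.
counts_seq, edges_seq = np.histogram(data, bins=[-4, -1, 0, 1, 4])
print(len(counts_seq))  # 4

# String: let NumPy choose the number of bins with a named estimator.
counts_auto, edges_auto = np.histogram(data, bins="auto")
print(len(counts_auto))
```

With an integer or a string, every data point is counted because the bins cover the full data range; with explicit edges, points outside the first and last edge are silently dropped.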

Summary

The numpy.histogram() function is a convenient and widely used way to compute the histogram of a dataset. It counts the values that fall within each bin and returns the counts together with the bin edges. It does not plot anything itself; the result can be visualized with Matplotlib, for example by passing the bin edges to matplotlib.pyplot.hist(). The function is useful for studying the distribution of a dataset and for spotting possible outliers or abnormalities in the data.
