Random sampling

These basic exercises test your understanding of generating random samples with NumPy.

Imports and a function to help with plotting samples are provided.

Imports

import numpy as np
import matplotlib.pyplot as plt

Plotting function

def distribution_plot(samples, bins=100, figsize=(5,3)):
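    '''
    Helper function to visualise a distribution as a line plot
    of its histogram density.
    
    Params:
    -----
    samples: np.ndarray
        A numpy array of quantitative data to plot as a histogram.
        
    bins: int, optional (default=100)
        Upper limit of the histogram range. The bin edges are
        np.arange(bins), i.e. unit-width bins covering 0 to bins - 1.
        Samples outside this range are not counted.
        
    figsize: (int, int), optional (default=(5, 3))
        Size of the figure in inches (width, height)
        
    Returns:
    -------
        fig, ax: a tuple containing matplotlib figure and axis objects.
    '''
    # bin edges 0, 1, ..., bins - 1, so the bin index equals the value of x
    hist = np.histogram(samples, bins=np.arange(bins), 
                        density=True)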
    
    fig = plt.figure(figsize=figsize)
    ax = fig.add_subplot()
    _ = ax.plot(hist[0])
    _ = ax.set_ylabel('p(x)')
    _ = ax.set_xlabel('x')
    
    return fig, ax
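
Because the helper passes np.arange(bins) as the bin edges, it may help to see what that call returns on its own. The following is a minimal sketch rather than part of the exercise; the seed and sample size are arbitrary choices made here for illustration.

```python
import numpy as np

# Seed and sample size are arbitrary, chosen only for this illustration.
rng = np.random.default_rng(0)
samples = rng.uniform(20, 80, size=100_000)

# Same call as inside distribution_plot with bins=100:
# edges 0, 1, ..., 99 give 99 unit-width bins.
density, edges = np.histogram(samples, bins=np.arange(100), density=True)

print(len(density))            # number of bins (one fewer than the edges)
print(density[:20].sum())      # bins below x = 20 contain no samples
print(density[20:80].mean())   # roughly 1/60 inside [20, 80)
```

The plotted line is therefore flat at about 1/60 between x = 20 and x = 80 and zero elsewhere. Data outside the range 0 to bins - 1 would simply not appear in the plot.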

Task 1

  • Create a numpy random number Generator object.

  • Draw 1,000,000 samples from the Uniform distribution with parameters low=20, high=80.

  • Use the provided distribution_plot function to check your sample.

Hints

rng = np.random.default_rng()
samples = rng.uniform(20, 80, size=1_000_000)
_ = distribution_plot(samples, bins=100)

[Plot output: p(x) against x for the uniform samples]
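
As a numerical complement to the plot, the sample can also be checked against the theoretical moments of a Uniform(low, high) distribution: mean (low + high) / 2 and variance (high - low)² / 12. This is a sketch of one possible check, not a required part of the task; the variable names are chosen here for illustration.

```python
import numpy as np

low, high = 20, 80
rng = np.random.default_rng()
samples = rng.uniform(low, high, size=1_000_000)

# Theoretical moments of a continuous uniform distribution
expected_mean = (low + high) / 2        # 50.0
expected_var = (high - low) ** 2 / 12   # 300.0

print(f"sample mean:     {samples.mean():.3f} (theory {expected_mean})")
print(f"sample variance: {samples.var():.3f} (theory {expected_var})")
print(f"min: {samples.min():.3f}, max: {samples.max():.3f}")
```

With one million samples the sample mean and variance should sit very close to 50 and 300, and every value should lie between 20 and 80. The exact figures change on each run because no seed is set.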

Task 2

  • Repeat Task 1, but this time set a random seed when creating the Generator.

  • Try the random seed 42.

  • Try a few different seeds to check your code.

rng = np.random.default_rng(42)
samples = rng.uniform(20, 80, size=1_000_000)

# look at first 10 samples
samples[:10]

array([66.43736291, 46.33270639, 71.51587519, 61.84208174, 25.65064087,
       78.5373411 , 65.66838212, 67.16385832, 27.68681796, 47.02315627])

# repeat and double check
rng = np.random.default_rng(42)
samples = rng.uniform(20, 80, size=1_000_000)

# look at first 10 samples
samples[:10]

array([66.43736291, 46.33270639, 71.51587519, 61.84208174, 25.65064087,
       78.5373411 , 65.66838212, 67.16385832, 27.68681796, 47.02315627])
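
Rather than comparing printed arrays by eye, the reproducibility check can be done programmatically. The sketch below assumes a small hypothetical helper, draw, introduced here only for illustration; the second seed (101) is an arbitrary choice.

```python
import numpy as np

def draw(seed, size=5):
    """Hypothetical helper: create a fresh Generator and draw uniform samples."""
    rng = np.random.default_rng(seed)
    return rng.uniform(20, 80, size=size)

# Same seed, fresh Generator each time: identical samples
same = np.array_equal(draw(42), draw(42))

# Different seeds: different samples
different = np.array_equal(draw(42), draw(101))

# Reusing one Generator continues its stream, so the second draw differs
rng = np.random.default_rng(42)
first = rng.uniform(20, 80, size=5)
second = rng.uniform(20, 80, size=5)
continued = np.array_equal(first, second)

print(same)       # True
print(different)  # False
print(continued)  # False
```

The last case shows why the solution recreates the Generator before repeating the draw: the seed fixes the starting point of the stream, not the result of every call.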