Consensus.Nomis module

API keys and connecting to NOMIS

Get a NOMIS API key by registering with NOMIS <https://www.nomisweb.co.uk>. When initialising the DownloadFromNomis class, pass the key to the api_key argument. If you need proxies to access the data, pass them as a dictionary to the proxies argument:

from Consensus.Nomis import DownloadFromNomis

api_key = '02bdlfsjkd3idk32j3jeaasd2'  # example API key; replace with your own
your_proxy_address = 'http://proxy.example.com:8080'  # placeholder; replace with your proxy address
# The proxy dictionary must follow this pattern. If you only have an HTTP proxy, reuse it unchanged for HTTPS.
proxies = {'http': your_proxy_address, 'https': your_proxy_address}
nomis = DownloadFromNomis(api_key=api_key, proxies=proxies)
nomis.connect()
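
The rule that an HTTP-only proxy is reused for HTTPS can be captured in a small helper. This is a minimal sketch: `build_proxies` is a hypothetical function, not part of Consensus, and the proxy address is a placeholder.

```python
from typing import Dict, Optional


def build_proxies(http: Optional[str], https: Optional[str] = None) -> Optional[Dict[str, str]]:
    """Return a proxies dictionary with both 'http' and 'https' keys.

    Hypothetical helper, not part of Consensus. If no HTTPS proxy is
    given, the HTTP address is reused unchanged.
    """
    if not http:
        return None  # no proxy configured
    return {"http": http, "https": https or http}


# Placeholder address; replace with your own proxy.
proxies = build_proxies("http://proxy.example.com:8080")
print(proxies)
```

Returning None when no proxy is configured matches the documented default of the proxies argument, which then falls back to the config file.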

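Alternatively, store the API key and proxies once with ConfigManager. DownloadFromNomis loads them from the config file when api_key and proxies are left as None:

from Consensus.ConfigManager import ConfigManager
from dotenv import load_dotenv
from pathlib import Path
from os import environ

dotenv_path = Path('.env')
load_dotenv(dotenv_path)
api_key = environ.get("NOMIS_API")
proxy = environ.get("PROXY")

conf = ConfigManager()
conf.save_config({"nomis_api_key": api_key, "proxies.http": proxy, "proxies.https": proxy})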
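
Because environ.get returns None for a missing variable, and ConnectToNomis raises an AssertionError when no API key is provided or found in the config, it can help to fail early with a clearer message. A minimal sketch, assuming the variable is called NOMIS_API as above; `require_env` is a hypothetical helper, not part of Consensus.

```python
from os import environ


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper, not part of Consensus.
    """
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Placeholder value so that the sketch runs without a .env file.
environ.setdefault("NOMIS_API", "example-key")
api_key = require_env("NOMIS_API")
```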

Example usage

from Consensus.Nomis import DownloadFromNomis
from Consensus.ConfigManager import ConfigManager
from dotenv import load_dotenv
from pathlib import Path
from os import environ

# get your API keys and proxy settings from .env file
dotenv_path = Path('.env')  # assuming .env file is in your working directory
load_dotenv(dotenv_path)
api_key = environ.get("NOMIS_API")  # assuming you've saved the API key to a variable called NOMIS_API
proxy = environ.get("PROXY") # assuming you've saved the proxy address to a variable called PROXY

# set up your config.json file - only necessary the first time you use the package
config = {
          "nomis_api_key": api_key,  # the key for NOMIS must be 'nomis_api_key'
          "proxies.http": proxy,  # you may not need to provide anything for proxy
          "proxies.https": proxy  # the http and https proxies can be different if your setup requires it
          }
conf = ConfigManager()
conf.save_config(config)  # pass the dictionary defined above

# establish connection
nomis = DownloadFromNomis()
nomis.connect()

# print all tables from NOMIS
nomis.print_table_info()

# Get more detailed information about a specific table. Use the string starting with NM_* when selecting a table.
# In this case, we choose TS054 - Tenure from Census 2021:
nomis.detailed_info_for_table('NM_2072_1')  #  TS054 - Tenure

# If you want the data for the whole country:
df_bulk = nomis.bulk_download('NM_2072_1')
print(df_bulk)

# And if you want just an extract for a specific geography, in our case England:
geography = {'geography': ['E92000001']}  # you can extend this list
df_england = nomis.download('NM_2072_1', params=geography)
print(df_england)

class Consensus.Nomis.ConnectToNomis(api_key=None, proxies=None)

Bases: object

Class to connect and retrieve data from the NOMIS API.

api_key

Attribute. NOMIS API key.

Type:

str

proxies

Attribute. HTTP and HTTPS proxy addresses as a dictionary {'http': http_addr, 'https': https_addr}.

Type:

Dict[str, str]

uid

Attribute. Unique identifier for API calls using the API key.

Type:

str

base_url

Attribute. Base URL for the NOMIS API.

Type:

str

url

Attribute. Complete URL for API requests.

Type:

str

r

Attribute. Response object from API requests.

Type:

requests.Response

config

Attribute. Loaded configuration details from Consensus.config_utils.load_config(), including API key and proxies.

Type:

dict

__init__(api_key=None, proxies=None)

Initialise ConnectToNomis with API key and proxies.

Parameters:
  • api_key (str) – NOMIS API key. Defaults to None, in which case it loads from the config file.

  • proxies (Dict[str, str]) – Proxy addresses. Defaults to None, in which case it loads from the config file.

Raises:

AssertionError – If no API key is provided or found in the config.

_create_geography_e_code(val)

Create a nine-character GSS code.

Parameters:

val (int) – Value for the GSS code.

Returns:

GSS code in the format ‘Exxxxxxxx’.

Return type:

str
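
The formatting step can be sketched as follows. This is an illustration rather than the library's code: it assumes the integer is zero-padded to eight digits after the leading 'E', and the function name is hypothetical.

```python
def create_geography_e_code(val: int) -> str:
    """Format an integer as a nine-character GSS code starting with 'E'.

    Illustrative sketch; zero-padding to eight digits is an assumption.
    """
    if not 0 <= val <= 99_999_999:
        raise ValueError("val must fit in eight digits")
    return f"E{val:08d}"


print(create_geography_e_code(92000001))
print(create_geography_e_code(16136))
```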

_find_exact_table(table_name)

Find and return the matching table for the given name.

Parameters:

table_name (str) – Name of the table to search for.

Returns:

The matching NOMIS table.

Return type:

Any

_geography_edges(nums)

Find edges in a list of integers to create ranges for geography codes.

Parameters:

nums (List[int]) – List of geographical codes.

Returns:

List of start and end pairs representing ranges.

Return type:

List[Any]
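
One way to find such edges is to group consecutive integers into runs. The sketch below is an assumption about the general technique, not the package's implementation. The name `geography_edges` and the (start, end) tuple format are illustrative.

```python
from typing import List, Tuple


def geography_edges(nums: List[int]) -> List[Tuple[int, int]]:
    """Group integers into (start, end) pairs of consecutive runs."""
    ordered = sorted(set(nums))  # ignore duplicates and input order
    edges: List[Tuple[int, int]] = []
    for n in ordered:
        if edges and n == edges[-1][1] + 1:
            edges[-1] = (edges[-1][0], n)  # extend the current run
        else:
            edges.append((n, n))  # start a new run
    return edges


print(geography_edges([1, 2, 3, 7, 8, 10]))
```

A single isolated code is returned as a pair whose start and end are equal, so callers can treat every entry uniformly.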

_unpack_geography_list(geographies)

Unpack a list of GSS codes, find edges, and format them for a URL.

Parameters:

geographies (List[str]) – List of geographical codes (as GSS codes).

Returns:

Formatted string for the URL.

Return type:

str
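
The whole pipeline (strip the prefix, find runs, rebuild codes, join) might look like the sketch below. Several details are assumptions rather than documented behaviour: every code is taken to start with 'E', ranges are written with a `...` separator, and entries are joined with commas. The function name is hypothetical.

```python
from itertools import groupby
from typing import List


def unpack_geography_list(geographies: List[str]) -> str:
    """Collapse GSS codes into a comma-separated string of codes and ranges.

    Illustrative sketch; the 'E' prefix, '...' range separator and comma
    join are assumptions about the URL format.
    """
    # Drop the leading letter and work with the numeric part.
    nums = sorted({int(code[1:]) for code in geographies})
    parts = []
    # Consecutive integers share the same (value - position) key.
    for _, run in groupby(enumerate(nums), key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        start, end = f"E{values[0]:08d}", f"E{values[-1]:08d}"
        parts.append(start if start == end else f"{start}...{end}")
    return ",".join(parts)


print(unpack_geography_list(["E00016136", "E00016137", "E00016138", "E00016140"]))
```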

connect(url=None)

Connect to the NOMIS API and fetch table structures.

Parameters:

url (str) – Custom URL for API connection. Defaults to None.

Raises:

KeyError – If proxies are not set and the connection fails without proxies.

Return type:

None

Returns:

None

detailed_info_for_table(table_name)

Print detailed information for a specific table.

Parameters:

table_name (str) – Name of the table to get details for.

Return type:

None

Returns:

None

get_all_tables()

Get all available tables from NOMIS.

Raises:

AssertionError – If the API connection was not successful.

Returns:

List of NOMIS tables.

Return type:

List[Any]

get_table_columns(table_name)

Get the columns of a specific table as a list of tuples.

Parameters:

table_name (str) – Name of the table to get details for.

Returns:

List of tuples of columns and column codes.

Return type:

List[Tuple[str, str]]

print_table_info()

Print brief information for all available tables.

Return type:

None

Returns:

None

url_creator(dataset, params=None, select_columns=None)

Create a URL string for data download from NOMIS.

Parameters:
  • dataset (str) – Name of the dataset to download.

  • params (Dict[str, List[str]]) – Dictionary of query parameters for filtering data. Defaults to None.

  • select_columns (List[str]) – List of columns to select in the API response. Defaults to None.

Raises:

AssertionError – If any value in params is not a list.

Return type:

None

Returns:

None
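
To make the roles of dataset, params and select_columns concrete, here is a minimal standalone sketch of assembling such a URL. It is not the method's actual implementation: the base URL, the `.data.csv` suffix, the `select` and `uid` query keys, and the function name `build_url` are all assumptions. Unlike the documented method, which returns None, this sketch returns the string.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode

# Assumed base URL for dataset requests.
BASE_URL = "https://www.nomisweb.co.uk/api/v01/dataset/"


def build_url(dataset: str,
              params: Optional[Dict[str, List]] = None,
              select_columns: Optional[List[str]] = None,
              uid: Optional[str] = None) -> str:
    """Assemble a download URL from a dataset id and query parameters."""
    query = {}
    for key, values in (params or {}).items():
        # Mirrors the documented check: every value must be a list.
        assert isinstance(values, list), f"Value for {key!r} must be a list"
        query[key] = ",".join(str(v) for v in values)
    if select_columns:
        query["select"] = ",".join(select_columns)
    if uid:
        query["uid"] = uid
    url = f"{BASE_URL}{dataset}.data.csv"
    if query:
        url += "?" + urlencode(query, safe=",")  # keep commas readable
    return url


print(build_url("NM_2072_1", {"geography": ["E92000001"]}))
```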

class Consensus.Nomis.DownloadFromNomis(*args, **kwargs)

Bases: ConnectToNomis

Wrapper class for downloading data from the NOMIS API.

Inherits from ConnectToNomis() and uses the NOMIS API to download datasets as CSV files or Pandas DataFrames.

api_key

NOMIS API key.

Type:

str

proxies

HTTP and HTTPS proxy addresses as a dictionary {'http': http_addr, 'https': https_addr}.

Type:

Dict[str, str]

uid

Unique identifier for API calls using the API key.

Type:

str

base_url

Base URL for the NOMIS API.

Type:

str

url

Complete URL for API requests.

Type:

str

r

Response object from API requests.

Type:

requests.Response

config

Loaded configuration details, including API key and proxies.

Type:

dict

Methods:

__init__(*args, **kwargs): Initialises the DownloadFromNomis() instance.

_bulk_download_url(dataset: str): Creates a URL for bulk downloading a dataset.

_download_checks(dataset: str, params: Dict[str, List], value_or_percent: str, table_columns: List[str]): Prepares the parameters and URL for downloading data.

table_to_csv(dataset: str, params: Dict[str, List] = None, file_name: str = None, table_columns: List[str] = None, save_location: str = '../nomis_download/', value_or_percent: str = None): Downloads a dataset as a CSV file.

bulk_download(dataset: str, data_format: str = 'pandas', save_location: str = '../nomis_download/'): Performs a bulk download of a dataset as either CSV or a Pandas DataFrame.

Usage:

Set up API key and proxies:

from Consensus.ConfigManager import ConfigManager
from Consensus.Nomis import DownloadFromNomis
from dotenv import load_dotenv
from pathlib import Path
from os import environ

dotenv_path = Path('.env')
load_dotenv(dotenv_path)
api_key = environ.get("NOMIS_API")
proxy = environ.get("PROXY")

conf = ConfigManager()
conf.save_config({"nomis_api_key": api_key,
                  "proxies": {"http": proxy,
                              "https": proxy}})

View datasets:

nomis = DownloadFromNomis()
nomis.connect()  # connect() returns None, so there is nothing to assign
nomis.print_table_info()

For bulk downloads:

nomis = DownloadFromNomis()
nomis.connect()
df_bulk = nomis.bulk_download('NM_2021_1')

Downloading specific tables:

geography = {'geography': ['E92000001']}
df = nomis.download('NM_2072_1', params=geography)

__init__(*args, **kwargs)

Initialises the DownloadFromNomis() instance.

Parameters:
  • *args – Variable length argument list passed to the parent class.

  • **kwargs – Arbitrary keyword arguments passed to the parent class.

_bulk_download_url(dataset)

Creates a URL for bulk downloading a dataset.

Parameters:

dataset (str) – The dataset identifier (e.g., NM_2021_1).

Return type:

None

Returns:

None

_create_geography_e_code(val)

Create a nine-character GSS code.

Parameters:

val (int) – Value for the GSS code.

Returns:

GSS code in the format ‘Exxxxxxxx’.

Return type:

str

_download_checks(dataset, params, value_or_percent, table_columns)

Prepares the parameters and URL for downloading data.

Parameters:
  • dataset (str) – The dataset identifier (e.g., NM_2021_1).

  • params (Dict[str, List]) – Dictionary of parameters for the query (e.g., {'geography': ['E00016136']}). Defaults to None.

  • value_or_percent (str) – Specifies whether to retrieve ‘value’ or ‘percent’ data.

  • table_columns (List[str]) – List of columns to include in the query.

Return type:

None

Returns:

None

_download_file(file_path)

Downloads a file to the specified path.

Parameters:

file_path (Path) – The file path where the downloaded file will be saved.

Return type:

None

Returns:

None

_download_to_pandas()

Downloads data directly into a Pandas DataFrame.

Returns:

The downloaded data as a Pandas DataFrame.

Return type:

pd.DataFrame

_find_exact_table(table_name)

Find and return the matching table for the given name.

Parameters:

table_name (str) – Name of the table to search for.

Returns:

The matching NOMIS table.

Return type:

Any

_geography_edges(nums)

Find edges in a list of integers to create ranges for geography codes.

Parameters:

nums (List[int]) – List of geographical codes.

Returns:

List of start and end pairs representing ranges.

Return type:

List[Any]

_unpack_geography_list(geographies)

Unpack a list of GSS codes, find edges, and format them for a URL.

Parameters:

geographies (List[str]) – List of geographical codes (as GSS codes).

Returns:

Formatted string for the URL.

Return type:

str

bulk_download(dataset, data_format='pandas', save_location='../nomis_download/')

Performs a bulk download of a dataset as either CSV or a Pandas DataFrame.

Parameters:
  • dataset (str) – The dataset identifier (e.g., NM_2021_1).

  • data_format (str) – Format of the downloaded data. Can be ‘csv’, ‘download’, ‘pandas’, or ‘df’. Defaults to ‘pandas’.

  • save_location (str) – Directory to save the CSV file if data_format is ‘csv’. Defaults to ‘../nomis_download/’.

Raises:

AssertionError – If data_format is not one of the accepted values.

Returns:

The downloaded data as a Pandas DataFrame if data_format is 'pandas' or 'df'.

Return type:

pd.DataFrame
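
The data_format check described above can be pictured as a small lookup. This is a sketch under stated assumptions, not the library's code: it treats 'download' as a synonym for 'csv' and 'df' as a synonym for 'pandas', which the parameter description suggests but does not state. The function name is hypothetical.

```python
def resolve_data_format(data_format: str = "pandas") -> str:
    """Map an accepted data_format value to 'csv' or 'pandas'.

    Illustrative sketch; the pairing of synonyms is an assumption.
    """
    accepted = {"csv": "csv", "download": "csv", "pandas": "pandas", "df": "pandas"}
    # Mirrors the documented AssertionError for unsupported values.
    assert data_format in accepted, f"data_format must be one of {sorted(accepted)}"
    return accepted[data_format]


print(resolve_data_format("df"))
```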

connect(url=None)

Connect to the NOMIS API and fetch table structures.

Parameters:

url (str) – Custom URL for API connection. Defaults to None.

Raises:

KeyError – If proxies are not set and the connection fails without proxies.

Return type:

None

Returns:

None

detailed_info_for_table(table_name)

Print detailed information for a specific table.

Parameters:

table_name (str) – Name of the table to get details for.

Return type:

None

Returns:

None

download(dataset, params=None, table_columns=None, value_or_percent=None)

Downloads a dataset as a Pandas DataFrame.

Parameters:
  • dataset (str) – The dataset identifier (e.g., NM_2021_1).

  • params (Dict[str, List]) – Dictionary of parameters (e.g., {'geography': ['E00016136'], 'age': [0, 2, 3]}). Defaults to None.

  • table_columns (List[str]) – List of columns to include in the dataset. Defaults to None.

  • value_or_percent (str) – Specifies whether to download ‘value’ or ‘percent’. Defaults to None.

Returns:

The downloaded data as a Pandas DataFrame.

Return type:

pd.DataFrame

get_all_tables()

Get all available tables from NOMIS.

Raises:

AssertionError – If the API connection was not successful.

Returns:

List of NOMIS tables.

Return type:

List[Any]

get_table_columns(table_name)

Get the columns of a specific table as a list of tuples.

Parameters:

table_name (str) – Name of the table to get details for.

Returns:

List of tuples of columns and column codes.

Return type:

List[Tuple[str, str]]

print_table_info()

Print brief information for all available tables.

Return type:

None

Returns:

None

table_to_csv(dataset, params=None, file_name=None, table_columns=None, save_location='../nomis_download/', value_or_percent=None)

Downloads a dataset as a CSV file.

Parameters:
  • dataset (str) – The dataset identifier (e.g., NM_2021_1).

  • params (Dict[str, List]) – Dictionary of parameters (e.g., {'geography': ['E00016136'], 'age': [0, 2, 3]}). Defaults to None.

  • file_name (str) – Custom name for the saved CSV file. Defaults to None.

  • table_columns (List[str]) – List of columns to include in the dataset. Defaults to None.

  • save_location (str) – Directory to save the downloaded CSV file. Defaults to ‘../nomis_download/’.

  • value_or_percent (str) – Specifies whether to download ‘value’ or ‘percent’. Defaults to None.

Return type:

None

Returns:

None

url_creator(dataset, params=None, select_columns=None)

Create a URL string for data download from NOMIS.

Parameters:
  • dataset (str) – Name of the dataset to download.

  • params (Dict[str, List[str]]) – Dictionary of query parameters for filtering data. Defaults to None.

  • select_columns (List[str]) – List of columns to select in the API response. Defaults to None.

Raises:

AssertionError – If any value in params is not a list.

Return type:

None

Returns:

None

class Consensus.Nomis.NomisTable(agencyid, annotations, id, components, name, uri, version, description=None)

Bases: object

A dataclass representing a structured output from NOMIS.

This class is designed to encapsulate the metadata and structure of a table retrieved from the NOMIS API. It provides methods for accessing detailed descriptions, annotations, and column information in a readable format.

agencyid

The ID of the agency that owns the table.

Type:

str

annotations

A dictionary containing annotations related to the table.

Type:

Dict[str, Any]

id

The unique identifier of the table.

Type:

str

components

A dictionary containing information about the components (columns) of the table.

Type:

Dict[str, Any]

name

A dictionary containing the name of the table.

Type:

Dict[str, Any]

uri

The URI that links to more information about the table.

Type:

str

version

The version number of the table.

Type:

str

description

An optional description of the table.

Type:

Optional[str]
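
The attribute list above maps naturally onto a Python dataclass. The sketch below is a hypothetical stand-in that mirrors the documented fields. It is not the NomisTable class itself. All values are placeholders, and the 'value' key inside name is an assumption about the payload shape.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TableSketch:
    """Hypothetical stand-in mirroring the fields documented for NomisTable."""
    agencyid: str
    annotations: Dict[str, Any]
    id: str
    components: Dict[str, Any]
    name: Dict[str, Any]
    uri: str
    version: str
    description: Optional[str] = None  # the only optional field

    def shorthand(self) -> str:
        # The "value" key is an assumption about how the name is stored.
        return f"{self.id}: {self.name.get('value', '')}"


# Placeholder values for illustration only.
table = TableSketch(
    agencyid="example-agency", annotations={}, id="NM_2072_1", components={},
    name={"value": "TS054 - Tenure"}, uri="example-uri", version="1.0",
)
print(table.shorthand())
```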

__init__(agencyid, annotations, id, components, name, uri, version, description=None)

agencyid: str

annotations: Dict[str, Any]

clean_annotations()

Cleans the annotations for more readable presentation and returns them as a list of strings.

Returns:

List of cleaned annotations

Return type:

List[str]

components: Dict[str, Any]

description: Optional[str] = None

detailed_description()

Prints a detailed and cleaned overview of the table, including its ID, description, annotations, and columns.

Return type:

None

Returns:

None

get_table_cols()

Returns a list of tuples, where each tuple contains a column code and its corresponding description.

Returns:

A list of tuples of columns

Return type:

List[Tuple[str, str]]

id: str

name: Dict[str, Any]

table_cols()

Cleans and returns the column information for the table in a readable format.

Returns:

A list of column descriptions as strings

Return type:

List[str]

table_shorthand()

Returns a shorthand description of the table, including its ID and name.

Return type:

None

Returns:

None

uri: str

version: str