Consensus.Nomis module
API keys and connecting to NOMIS
Get a NOMIS API key by registering with NOMIS <https://www.nomisweb.co.uk>. When initialising the DownloadFromNomis class, pass the key to the api_key argument. If you need proxies to access the data, pass them as a dictionary to the proxies argument:
api_key = '02bdlfsjkd3idk32j3jeaasd2' # this is an example of an API key
proxies = {'http': your_proxy_address, 'https': your_proxy_address} # the proxy dictionary must follow this pattern; if you only have an HTTP proxy, use the same address for 'https'
nomis = DownloadFromNomis(api_key=api_key, proxies=proxies)
nomis.connect()
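The proxy dictionary pattern described above can be sketched as a small helper. This is a minimal illustration, not part of Consensus: the function name `build_proxies` and the proxy address are hypothetical placeholders.

```python
from typing import Dict, Optional


def build_proxies(http_proxy: str, https_proxy: Optional[str] = None) -> Dict[str, str]:
    """Return a proxy dictionary with both 'http' and 'https' keys.

    Hypothetical helper for illustration only; it is not part of Consensus.
    """
    if not http_proxy:
        raise ValueError("an HTTP proxy address is required")
    # Reuse the HTTP address unchanged when no separate HTTPS proxy exists.
    return {"http": http_proxy, "https": https_proxy or http_proxy}


proxies = build_proxies("http://proxy.example.org:8080")  # placeholder address
print(proxies)
```

The resulting dictionary has the shape expected by the proxies argument of DownloadFromNomis.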
Alternatively, you can use the ConfigManager to store API keys:
dotenv_path = Path('.env')
load_dotenv(dotenv_path)
api_key = environ.get("NOMIS_API")
proxy = environ.get("PROXY")
conf = ConfigManager()
conf.save_config({"nomis_api_key": api_key, "proxies.http": proxy, "proxies.https": proxy})
Example usage
from Consensus.Nomis import DownloadFromNomis
from Consensus.ConfigManager import ConfigManager
from dotenv import load_dotenv
from pathlib import Path
from os import environ
# get your API keys and proxy settings from .env file
dotenv_path = Path('.env') # assuming .env file is in your working directory
load_dotenv(dotenv_path)
api_key = environ.get("NOMIS_API") # assuming you've saved the API key to a variable called NOMIS_API
proxy = environ.get("PROXY") # assuming you've saved the proxy address to a variable called PROXY
# set up your config.json file - only necessary the first time you use the package
config = {
"nomis_api_key": api_key, # the key for NOMIS must be 'nomis_api_key'
"proxies.http": proxy, # you may not need to provide anything for proxy
"proxies.https": proxy # the http and https proxies can be different if your setup requires it
}
conf = ConfigManager()
conf.save_config(config)
# establish connection
nomis = DownloadFromNomis()
nomis.connect()
# print all tables from NOMIS
nomis.print_table_info()
# Get more detailed information about a specific table. Use the string starting with NM_* when selecting a table.
# In this case, we choose TS054 - Tenure from Census 2021:
nomis.detailed_info_for_table('NM_2072_1') # TS054 - Tenure
# If you want the data for the whole country:
df_bulk = nomis.bulk_download('NM_2072_1')
print(df_bulk)
# And if you want just an extract for a specific geography, in our case England:
geography = {'geography': ['E92000001']} # you can extend this list
df_england = nomis.download('NM_2072_1', params=geography)
print(df_england)
- class Consensus.Nomis.ConnectToNomis(api_key=None, proxies=None)
Bases:
object
Class to connect and retrieve data from the NOMIS API.
- api_key
Attribute. NOMIS API key.
- Type:
str
- proxies
Attribute. HTTP and HTTPS proxy addresses as a dictionary {‘http’: http_addr, ‘https’: https_addr}.
- Type:
Dict[str, str]
- uid
Attribute. Unique identifier for API calls using the API key.
- Type:
str
- base_url
Attribute. Base URL for the NOMIS API.
- Type:
str
- url
Attribute. Complete URL for API requests.
- Type:
str
- r
Attribute. Response object from API requests.
- Type:
requests.Response
- config
Attribute. Loaded configuration details from Consensus.config_utils.load_config(), including API key and proxies.
- Type:
dict
- __init__(api_key=None, proxies=None)
Initialise ConnectToNomis with API key and proxies.
- Parameters:
api_key (str) – NOMIS API key. Defaults to None, in which case it loads from the config file.
proxies (Dict[str, str]) – Proxy addresses. Defaults to None, in which case it loads from the config file.
- Raises:
AssertionError – If no API key is provided or found in the config.
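The fallback described for `__init__` (explicit argument first, then the config file, otherwise an AssertionError) might look roughly like the sketch below. The function `resolve_api_key` is a hypothetical stand-in and the real implementation may differ; only the 'nomis_api_key' config key comes from the usage section.

```python
from typing import Any, Dict, Optional


def resolve_api_key(api_key: Optional[str], config: Dict[str, Any]) -> str:
    """Prefer the explicit argument, then the 'nomis_api_key' config entry.

    Hypothetical helper for illustration; not the library's own code.
    """
    key = api_key if api_key is not None else config.get("nomis_api_key")
    if not key:
        # Same exception type as documented for a missing key.
        raise AssertionError("No NOMIS API key was provided or found in the config.")
    return key


print(resolve_api_key(None, {"nomis_api_key": "example-key"}))  # example-key
```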
- _create_geography_e_code(val)
Create a nine-character GSS code.
- Parameters:
val (int) – Value for the GSS code.
- Returns:
GSS code in the format ‘Exxxxxxxx’.
- Return type:
str
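As a rough illustration of the 'Exxxxxxxx' format, an integer can be zero-padded to eight digits and prefixed with 'E'. This sketch assumes that is all the method does; the public function name is hypothetical.

```python
def create_geography_e_code(val: int) -> str:
    """Format an integer as a nine-character code: 'E' plus eight digits.

    Illustrative sketch only; the padding rule is an assumption.
    """
    if not 0 <= val <= 99_999_999:
        raise ValueError("value must fit in eight digits")
    return f"E{val:08d}"


print(create_geography_e_code(92000001))  # E92000001
print(create_geography_e_code(16136))     # E00016136
```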
- _find_exact_table(table_name)
Find and return the matching table for the given name.
- Parameters:
table_name (str) – Name of the table to search for.
- Returns:
The matching NOMIS table.
- Return type:
Any
- _geography_edges(nums)
Find edges in a list of integers to create ranges for geography codes.
- Parameters:
nums (List[int]) – List of geographical codes.
- Returns:
List of start and end pairs representing ranges.
- Return type:
List[Any]
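One plausible way to find such edges is to sort the integers and merge consecutive values into (start, end) pairs. The sketch below is an assumption about the approach, with a hypothetical function name, and is not taken from the library source.

```python
from typing import List, Tuple


def geography_edges(nums: List[int]) -> List[Tuple[int, int]]:
    """Collapse integers into (start, end) pairs of consecutive runs."""
    ordered = sorted(set(nums))
    edges: List[Tuple[int, int]] = []
    for n in ordered:
        if edges and n == edges[-1][1] + 1:
            # The value continues the current run, so move its end forward.
            edges[-1] = (edges[-1][0], n)
        else:
            # A gap was found (or this is the first value): start a new run.
            edges.append((n, n))
    return edges


print(geography_edges([1, 2, 3, 7, 9, 10]))  # [(1, 3), (7, 7), (9, 10)]
```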
- _unpack_geography_list(geographies)
Unpack a list of GSS codes, find edges, and format them for a URL.
- Parameters:
geographies (List[str]) – List of geographical codes (as GSS codes).
- Returns:
Formatted string for the URL.
- Return type:
str
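The unpack-and-format step could be sketched as follows. Several details are assumptions made for illustration: codes are taken to start with 'E', consecutive runs are joined with '...', and separate runs are joined with commas. The exact string the library produces may differ.

```python
from typing import List


def _format_run(start: int, end: int) -> str:
    """Render one run as a single code or a 'first...last' range."""
    first, last = f"E{start:08d}", f"E{end:08d}"
    return first if start == end else f"{first}...{last}"


def unpack_geography_list(geographies: List[str]) -> str:
    """Turn 'E'-prefixed GSS codes into a compact string for a URL.

    Hypothetical sketch; the '...' and ',' separators are assumptions.
    """
    if not geographies:
        return ""
    # Drop the leading letter and work with the numeric part.
    nums = sorted({int(code[1:]) for code in geographies})
    parts: List[str] = []
    start = prev = nums[0]
    for n in nums[1:]:
        if n == prev + 1:
            prev = n  # still inside the current consecutive run
            continue
        parts.append(_format_run(start, prev))
        start = prev = n
    parts.append(_format_run(start, prev))
    return ",".join(parts)


codes = ["E00016136", "E00016137", "E00016138", "E92000001"]
print(unpack_geography_list(codes))  # E00016136...E00016138,E92000001
```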
- connect(url=None)
Connect to the NOMIS API and fetch table structures.
- Parameters:
url (str) – Custom URL for API connection. Defaults to None.
- Raises:
KeyError – If proxies are not set and the connection fails without proxies.
- Return type:
None
- Returns:
None
- detailed_info_for_table(table_name)
Print detailed information for a specific table.
- Parameters:
table_name (str) – Name of the table to get details for.
- Return type:
None
- Returns:
None
- get_all_tables()
Get all available tables from NOMIS.
- Raises:
AssertionError – If the API connection was not successful.
- Returns:
List of NOMIS tables.
- Return type:
List[Any]
- get_table_columns(table_name)
Get the columns of a specific table as a list of tuples.
- Parameters:
table_name (str) – Name of the table to get details for.
- Returns:
List of tuples of columns and column codes.
- Return type:
List[Tuple[str, str]]
- print_table_info()
Print brief information for all available tables.
- Return type:
None
- Returns:
None
- url_creator(dataset, params=None, select_columns=None)
Create a URL string for data download from NOMIS.
- Parameters:
dataset (str) – Name of the dataset to download.
params (Dict[str, List[str]]) – Dictionary of query parameters for filtering data. Defaults to None.
select_columns (List[str]) – List of columns to select in the API response. Defaults to None.
- Raises:
AssertionError – If the value for any key in params is not a list.
- Return type:
None
- Returns:
None
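To illustrate the list requirement on params, the sketch below validates each value and assembles a query string with the standard library. It is not the library's URL builder: the function name `build_query`, the `select` parameter name, and the column names are assumptions.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode


def build_query(params: Optional[Dict[str, List[str]]] = None,
                select_columns: Optional[List[str]] = None) -> str:
    """Validate params and return a URL query string (hypothetical sketch)."""
    query: Dict[str, str] = {}
    for key, values in (params or {}).items():
        if not isinstance(values, list):
            # Same exception type as documented for non-list values.
            raise AssertionError(f"Values for '{key}' must be given as a list.")
        query[key] = ",".join(str(v) for v in values)
    if select_columns:
        # 'select' as the parameter name is an assumption.
        query["select"] = ",".join(select_columns)
    # Keep commas unescaped so multi-value parameters stay readable.
    return urlencode(query, safe=",")


print(build_query({"geography": ["E92000001"]}, ["geography_name", "obs_value"]))
# geography=E92000001&select=geography_name,obs_value
```

Passing a bare string such as {'geography': 'E92000001'} raises the AssertionError, which is why the usage examples wrap single codes in a list.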
- class Consensus.Nomis.DownloadFromNomis(*args, **kwargs)
Bases:
ConnectToNomis
Wrapper class for downloading data from the NOMIS API.
Inherits from ConnectToNomis() to utilize the NOMIS API for downloading datasets as CSV files or Pandas DataFrames.
- api_key
NOMIS API key.
- Type:
str
- proxies
HTTP and HTTPS proxy addresses as a dictionary {‘http’: http_addr, ‘https’: https_addr}.
- Type:
Dict[str, str]
- uid
Unique identifier for API calls using the API key.
- Type:
str
- base_url
Base URL for the NOMIS API.
- Type:
str
- url
Complete URL for API requests.
- Type:
str
- r
Response object from API requests.
- Type:
requests.Response
- config
Loaded configuration details, including API key and proxies.
- Type:
dict
- __init__(*args, **kwargs): Initialises the DownloadFromNomis() instance.
- _bulk_download_url(dataset: str): Creates a URL for bulk downloading a dataset.
- _download_checks(dataset: str, params: Dict[str, List], value_or_percent: str, table_columns: List[str]): Prepares the parameters and URL for downloading data.
- table_to_csv(dataset: str, params: Dict[str, List] = None, file_name: str = None, table_columns: List[str] = None, save_location: str = '../nomis_download/', value_or_percent: str = None): Downloads a dataset as a CSV file.
- bulk_download(dataset: str, save_location: str = '../nomis_download/'): Downloads a dataset as a Pandas DataFrame.
Usage:
Set up API key and proxies:
from Consensus.ConfigManager import ConfigManager
from Consensus.Nomis import DownloadFromNomis
from dotenv import load_dotenv
from pathlib import Path
from os import environ
dotenv_path = Path('.env')
load_dotenv(dotenv_path)
api_key = environ.get("NOMIS_API")
proxy = environ.get("PROXY")
conf = ConfigManager()
conf.save_config({"nomis_api_key": api_key, "proxies": {"http": proxy, "https": proxy}})
View datasets:
nomis = DownloadFromNomis()
nomis.connect()
nomis.print_table_info()
For bulk downloads:
nomis = DownloadFromNomis()
nomis.connect()
nomis.bulk_download('NM_2021_1')
Downloading specific tables:
geography = {'geography': ['E92000001']}
df = nomis.download('NM_2072_1', params=geography)
- __init__(*args, **kwargs)
Initialises the DownloadFromNomis() instance.
- Parameters:
*args – Variable length argument list passed to the parent class.
**kwargs – Arbitrary keyword arguments passed to the parent class.
- _bulk_download_url(dataset)
Creates a URL for bulk downloading a dataset.
- Parameters:
dataset (str) – The dataset identifier (e.g., NM_2021_1).
- Return type:
None
- Returns:
None
- _create_geography_e_code(val)
Create a nine-character GSS code.
- Parameters:
val (int) – Value for the GSS code.
- Returns:
GSS code in the format ‘Exxxxxxxx’.
- Return type:
str
- _download_checks(dataset, params, value_or_percent, table_columns)
Prepares the parameters and URL for downloading data.
- Parameters:
dataset (str) – The dataset identifier (e.g., NM_2021_1).
params (Dict[str, List]) – Dictionary of parameters for the query (e.g., {‘geography’: [‘E00016136’]}). Defaults to None.
value_or_percent (str) – Specifies whether to retrieve ‘value’ or ‘percent’ data.
table_columns (List[str]) – List of columns to include in the query.
- Return type:
None
- Returns:
None
- _download_file(file_path)
Downloads a file to the specified path.
- Parameters:
file_path (Path) – The file path where the downloaded file will be saved.
- Return type:
None
- Returns:
None
- _download_to_pandas()
Downloads data directly into a Pandas DataFrame.
- Returns:
The downloaded data as a Pandas DataFrame.
- Return type:
pd.DataFrame
- _find_exact_table(table_name)
Find and return the matching table for the given name.
- Parameters:
table_name (str) – Name of the table to search for.
- Returns:
The matching NOMIS table.
- Return type:
Any
- _geography_edges(nums)
Find edges in a list of integers to create ranges for geography codes.
- Parameters:
nums (List[int]) – List of geographical codes.
- Returns:
List of start and end pairs representing ranges.
- Return type:
List[Any]
- _unpack_geography_list(geographies)
Unpack a list of GSS codes, find edges, and format them for a URL.
- Parameters:
geographies (List[str]) – List of geographical codes (as GSS codes).
- Returns:
Formatted string for the URL.
- Return type:
str
- bulk_download(dataset, data_format='pandas', save_location='../nomis_download/')
Performs a bulk download of a dataset as either CSV or a Pandas DataFrame.
- Parameters:
dataset (str) – The dataset identifier (e.g., NM_2021_1).
data_format (str) – Format of the downloaded data. Can be ‘csv’, ‘download’, ‘pandas’, or ‘df’. Defaults to ‘pandas’.
save_location (str) – Directory to save the CSV file if data_format is ‘csv’. Defaults to ‘../nomis_download/’.
- Raises:
AssertionError – If data_format is not one of the accepted values.
- Returns:
The downloaded data as a Pandas DataFrame if data_format is ‘pandas’.
- Return type:
pd.DataFrame
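The four accepted data_format values could be checked and reduced to two behaviours along these lines. The sketch assumes that 'download' is an alias for 'csv' and 'df' an alias for 'pandas', which the documentation suggests but does not state outright; `normalise_data_format` is a hypothetical name.

```python
def normalise_data_format(data_format: str = "pandas") -> str:
    """Map an accepted data_format value to 'csv' or 'pandas'.

    Hypothetical helper; the alias pairing is an assumption.
    """
    aliases = {"csv": "csv", "download": "csv", "pandas": "pandas", "df": "pandas"}
    if data_format not in aliases:
        # Same exception type as documented for an unsupported format.
        raise AssertionError(
            f"data_format must be one of {sorted(aliases)}, got {data_format!r}"
        )
    return aliases[data_format]


print(normalise_data_format("df"))        # pandas
print(normalise_data_format("download"))  # csv
```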
- connect(url=None)
Connect to the NOMIS API and fetch table structures.
- Parameters:
url (str) – Custom URL for API connection. Defaults to None.
- Raises:
KeyError – If proxies are not set and the connection fails without proxies.
- Return type:
None
- Returns:
None
- detailed_info_for_table(table_name)
Print detailed information for a specific table.
- Parameters:
table_name (str) – Name of the table to get details for.
- Return type:
None
- Returns:
None
- download(dataset, params=None, table_columns=None, value_or_percent=None)
Downloads a dataset as a Pandas DataFrame.
- Parameters:
dataset (str) – The dataset identifier (e.g., NM_2021_1).
params (Dict[str, List]) – Dictionary of parameters (e.g., {‘geography’: [‘E00016136’], ‘age’: [0, 2, 3]}). Defaults to None.
table_columns (List[str]) – List of columns to include in the dataset. Defaults to None.
value_or_percent (str) – Specifies whether to download ‘value’ or ‘percent’. Defaults to None.
- Returns:
The downloaded data as a Pandas DataFrame.
- Return type:
pd.DataFrame
- get_all_tables()
Get all available tables from NOMIS.
- Raises:
AssertionError – If the API connection was not successful.
- Returns:
List of NOMIS tables.
- Return type:
List[Any]
- get_table_columns(table_name)
Get the columns of a specific table as a list of tuples.
- Parameters:
table_name (str) – Name of the table to get details for.
- Returns:
List of tuples of columns and column codes.
- Return type:
List[Tuple[str, str]]
- print_table_info()
Print brief information for all available tables.
- Return type:
None
- Returns:
None
- table_to_csv(dataset, params=None, file_name=None, table_columns=None, save_location='../nomis_download/', value_or_percent=None)
Downloads a dataset as a CSV file.
- Parameters:
dataset (str) – The dataset identifier (e.g., NM_2021_1).
params (Dict[str, List]) – Dictionary of parameters (e.g., {‘geography’: [‘E00016136’], ‘age’: [0, 2, 3]}). Defaults to None.
file_name (str) – Custom name for the saved CSV file. Defaults to None.
table_columns (List[str]) – List of columns to include in the dataset. Defaults to None.
save_location (str) – Directory to save the downloaded CSV file. Defaults to ‘../nomis_download/’.
value_or_percent (str) – Specifies whether to download ‘value’ or ‘percent’. Defaults to None.
- Return type:
None
- Returns:
None
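How save_location and file_name might combine into a target path can be sketched with pathlib. The fallback to the dataset identifier when file_name is None and the automatic '.csv' suffix are assumptions for illustration; `csv_target_path` is not a Consensus function.

```python
from pathlib import Path
from typing import Optional


def csv_target_path(dataset: str, file_name: Optional[str] = None,
                    save_location: str = "../nomis_download/") -> Path:
    """Build the path a downloaded CSV could be written to (hypothetical)."""
    # Assumption: fall back to the dataset identifier when no name is given.
    name = file_name or f"{dataset}.csv"
    if not name.endswith(".csv"):
        name += ".csv"
    return Path(save_location) / name


print(csv_target_path("NM_2072_1"))
print(csv_target_path("NM_2072_1", file_name="tenure_england", save_location="data"))
```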
- url_creator(dataset, params=None, select_columns=None)
Create a URL string for data download from NOMIS.
- Parameters:
dataset (str) – Name of the dataset to download.
params (Dict[str, List[str]]) – Dictionary of query parameters for filtering data. Defaults to None.
select_columns (List[str]) – List of columns to select in the API response. Defaults to None.
- Raises:
AssertionError – If the value for any key in params is not a list.
- Return type:
None
- Returns:
None
- class Consensus.Nomis.NomisTable(agencyid, annotations, id, components, name, uri, version, description=None)
Bases:
object
A dataclass representing a structured output from NOMIS.
This class is designed to encapsulate the metadata and structure of a table retrieved from the NOMIS API. It provides methods for accessing detailed descriptions, annotations, and column information in a readable format.
- agencyid
The ID of the agency that owns the table.
- Type:
str
- annotations
A dictionary containing annotations related to the table.
- Type:
Dict[str, Any]
- id
The unique identifier of the table.
- Type:
str
- components
A dictionary containing information about the components (columns) of the table.
- Type:
Dict[str, Any]
- name
A dictionary containing the name of the table.
- Type:
Dict[str, Any]
- uri
The URI that links to more information about the table.
- Type:
str
- version
The version number of the table.
- Type:
str
- description
An optional description of the table.
- Type:
Optional[str]
- __init__(agencyid, annotations, id, components, name, uri, version, description=None)
- agencyid: str
- annotations: Dict[str, Any]
- clean_annotations()
Cleans the annotations for more readable presentation and returns them as a list of strings.
- Returns:
List of cleaned annotations
- Return type:
List[str]
- components: Dict[str, Any]
- description: Optional[str] = None
- detailed_description()
Prints a detailed and cleaned overview of the table, including its ID, description, annotations, and columns.
- Return type:
None
- Returns:
None
- get_table_cols()
Returns a list of tuples, where each tuple contains a column code and its corresponding description.
- Returns:
A list of tuples of columns
- Return type:
List[Tuple[str, str]]
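A list of (code, description) tuples is convenient for looking up the column code to use in a query. The sample data and the helper `find_column_code` below are hypothetical and only mirror the documented return shape.

```python
from typing import List, Optional, Tuple

# Hypothetical sample shaped like the documented return value: (code, description).
columns: List[Tuple[str, str]] = [
    ("GEOGRAPHY", "Geography"),
    ("C2021_TENURE_9", "Tenure of household"),
    ("MEASURES", "Measures"),
]


def find_column_code(cols: List[Tuple[str, str]], needle: str) -> Optional[str]:
    """Return the first code whose description contains the search text."""
    for code, description in cols:
        if needle.lower() in description.lower():
            return code
    return None


print(find_column_code(columns, "tenure"))  # C2021_TENURE_9
```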
- id: str
- name: Dict[str, Any]
- table_cols()
Cleans and returns the column information for the table in a readable format.
- Returns:
A list of column descriptions as strings
- Return type:
List[str]
- table_shorthand()
Returns a shorthand description of the table, including its ID and name.
- Return type:
None
- Returns:
None
- uri: str
- version: str
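The field list above can be mirrored in a small stand-in dataclass to show how a table record might be constructed from a dictionary of keyword arguments. `TableSketch` is a hypothetical class, not NomisTable itself, and every value in `raw` is a placeholder.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class TableSketch:
    """Stand-in with the same fields as documented for NomisTable."""
    agencyid: str
    annotations: Dict[str, Any]
    id: str
    components: Dict[str, Any]
    name: Dict[str, Any]
    uri: str
    version: str
    description: Optional[str] = None  # optional, as in the documented signature


# Placeholder values for illustration only.
raw = {
    "agencyid": "NOMIS",
    "annotations": {},
    "id": "NM_2072_1",
    "components": {},
    "name": {"value": "TS054 - Tenure"},
    "uri": "placeholder-uri",
    "version": "1.0",
}
table = TableSketch(**raw)
print(table.id, table.description)  # NM_2072_1 None
```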