Consensus.EsriServers module

This module contains classes that connect to Esri Servers and download data from them. It is designed to be used in conjunction with the Consensus.EsriConnector module’s FeatureServer() class.

class Consensus.EsriServers.OpenGeography(*args, **kwargs)

Bases: EsriConnector

Open Geography Portal

This module provides a class OpenGeography() that connects to the Open Geography Portal API.

Running OpenGeography().build_lookup() is necessary if you want to make use of the SmartLinker() class from the GeocodeMerger module.

Usage:

This class works in the same way as the EsriConnector() class, but it is designed specifically for the Open Geography Portal API. It relies on the build_lookup() method, which creates a lookup table for the portal’s FeatureServers and saves it to a JSON file.

from Consensus.EsriServers import OpenGeography
import asyncio

async def build_ogp_lookup():
    ogp = OpenGeography()
    await ogp.initialise()
    await ogp.build_lookup()

asyncio.run(build_ogp_lookup())

As with the TFL class, you can combine OpenGeography() with the FeatureServer() class to download data from the portal’s FeatureServers.

from Consensus.EsriConnector import FeatureServer
from Consensus.EsriServers import OpenGeography
from Consensus.utils import where_clause_maker
import asyncio

async def download_test_data():
    og = OpenGeography(max_retries=30, retry_delay=2)
    await og.initialise()

    fs_service_table = og.service_table
    fs = FeatureServer()

    column_name = 'WD23NM'
    geographic_areas = ['Brockley']
    service_name = 'Wards_December_2023_Boundaries_UK_BSC'
    layers = og.select_layers_by_service(service_name=service_name)  # choose the first layer of the 'Wards_December_2023_Boundaries_UK_BSC' service
    layer_full_name = layers[0].full_name  # use the layer's ``full_name`` attribute to select it in ``fs.setup()``

    where_clause = where_clause_maker(values=geographic_areas, column=column_name)  # a helper function that creates the SQL where clause for Esri Servers

    await fs.setup(full_name=layer_full_name, service_table=fs_service_table, max_retries=30, retry_delay=2, chunk_size=50)
    output = await fs.download(where_clause=where_clause, return_geometry=True)
    print(output)

asyncio.run(download_test_data())

However, it is perhaps best to rely on the SmartLinker() class from the GeocodeMerger module for more complex operations.
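The download example above passes a where clause built by where_clause_maker(). The exact string that helper produces is not shown in this reference, so the following is only a sketch of what such a clause could look like, assuming a plain SQL IN expression. The function name sketch_where_clause is hypothetical and is not part of Consensus.

```python
from typing import Iterable


def sketch_where_clause(values: Iterable[str], column: str) -> str:
    """Build an IN clause for one column (illustrative, not the library's code)."""
    values = list(values)
    if not values:
        # An empty IN () list is not valid SQL, so refuse it early.
        raise ValueError("at least one value is required")
    # Escape single quotes by doubling them, as standard SQL requires.
    quoted = ", ".join("'" + str(v).replace("'", "''") + "'" for v in values)
    return f"{column} IN ({quoted})"


print(sketch_where_clause(values=["Brockley"], column="WD23NM"))
# prints: WD23NM IN ('Brockley')
```

In real use, prefer where_clause_maker() itself, since it is the helper the library provides for Esri servers.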

__init__(*args, **kwargs)

Initialise class.

Parameters:
  • *args – Arguments for the EsriConnector class.

  • **kwargs – Keyword arguments for the EsriConnector class.

Returns:

None

async _fetch_response(session)

Helper method to fetch the response from the Esri server.

Parameters:

session (aiohttp.ClientSession) – The aiohttp.ClientSession object.

Returns:

The JSON response from the Esri server.

Return type:

Dict

async _load_all_services()

Load services into a dictionary.

Return type:

None

Returns:

None

async _validate_response()

Validate access to the base URL asynchronously using aiohttp. When a response is received, call _load_all_services() to load services into a dictionary.

Return type:

None

Returns:

None
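The usage examples pass max_retries and retry_delay when constructing the connector. How the library applies them internally is not documented here. The sketch below shows one common retry pattern, assuming a fixed delay between attempts. The names call_with_retries and flaky_request are hypothetical and are not part of Consensus.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def call_with_retries(
    request: Callable[[], Awaitable[T]],
    max_retries: int = 30,
    retry_delay: float = 2.0,
) -> T:
    """Await request() until it succeeds or max_retries attempts have failed."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            return await request()
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
            if attempt < max_retries:
                # Wait a fixed delay before the next attempt.
                await asyncio.sleep(retry_delay)
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error


async def demo() -> str:
    attempts = {"count": 0}

    async def flaky_request() -> str:
        # Fails twice, then succeeds, to simulate a server that is slow to respond.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ConnectionError("server not ready")
        return "ok"

    return await call_with_retries(flaky_request, max_retries=5, retry_delay=0.01)


print(asyncio.run(demo()))
```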

async build_lookup(parent_path=PosixPath('/home/runner/work/Consensus/Consensus/Consensus'), included_services=[], replace_old=True)

Build a lookup table from scratch and save it to a JSON file.

Parameters:
  • parent_path (Path) – Parent path to save the lookup file.

  • included_services (List[str]) – List of services to include in the lookup. Defaults to [], which is interpreted as ‘all’.

  • replace_old (bool) – Whether to replace the old lookup file. Defaults to True.

Returns:

The lookup table as a pandas DataFrame.

Return type:

pd.DataFrame
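The included_services parameter treats an empty list as ‘all services’. The sketch below illustrates that convention on a plain dictionary. The function select_services and the example service URLs are hypothetical; this is not the library’s implementation.

```python
from typing import Dict, List


def select_services(all_services: Dict[str, str], included_services: List[str]) -> Dict[str, str]:
    """Return the requested services, or every service if the list is empty."""
    if not included_services:
        # An empty list means "no filter", i.e. keep everything.
        return dict(all_services)
    wanted = set(included_services)
    return {name: url for name, url in all_services.items() if name in wanted}


# Placeholder data: the URLs below are made up for illustration.
services = {
    "Wards_December_2023_Boundaries_UK_BSC": "https://example.com/wards",
    "Bus_Stops": "https://example.com/bus_stops",
}

print(len(select_services(services, [])))
print(list(select_services(services, ["Bus_Stops"])))
```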

async field_matching_condition(field)

Condition for matchable fields. Service() uses this method to filter the fields that are added to the matchable_fields column, which SmartLinker() subsequently uses for matching data tables. Override this method if you want to change the condition for matchable fields. Each Esri ArcGIS server has its own naming rules, so the condition for a custom server is left to the user. If you are using a built-in server (e.g. TFL or Open Geography Portal), you don’t have to touch this method.

Parameters:

field (Dict[str, str]) – The field dictionary. This is the input coming from Service().

Returns:

True if the field ends with ‘CD’ or ‘NM’ and the last 4 characters before the end are numeric, or if the field is in the matchable_fields_extension list, False otherwise.

Return type:

bool
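The rule described above can be sketched as a standalone function. The wording of the rule is ambiguous, so this sketch assumes it refers to codes such as WD23NM: two digits followed by ‘CD’ or ‘NM’. It also assumes that the field dictionary stores the column name under a "name" key. The function is_matchable and its extension argument are hypothetical stand-ins for the method and the matchable_fields_extension list.

```python
from typing import Dict, Sequence


def is_matchable(field: Dict[str, str], extension: Sequence[str] = ()) -> bool:
    """Return True for names like WD23NM or LAD23CD, or names listed in extension."""
    name = field["name"].upper()
    # Explicitly allowed fields always match.
    if name in {item.upper() for item in extension}:
        return True
    # Otherwise require a CD/NM suffix preceded by two digits (assumed to be a year).
    return name[-2:] in ("CD", "NM") and name[-4:-2].isdigit()


print(is_matchable({"name": "WD23NM"}))
print(is_matchable({"name": "OBJECTID"}))
print(is_matchable({"name": "PCDS"}, extension=["PCDS"]))
```

If the server you connect to uses a different naming scheme, the same shape of check can be adapted in an overriding method.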

async get_layer_obj(service, session)

Fetch metadata for a service and add it to the service table.

Parameters:
  • service (Dict[str, str]) – Dictionary of services.

  • session (aiohttp.ClientSession) – The aiohttp.ClientSession object.

Return type:

None

Returns:

None

async initialise()

Run this method to initialise the class session.

Return type:

None

Returns:

None

async metadata_as_pandas(included_services=[])

Asynchronously create a Pandas DataFrame of selected tables’ metadata.

Parameters:

included_services (List[str]) – A list of service names to include in the DataFrame. If empty, all services are included.

Returns:

A DataFrame containing the metadata of the selected services.

Return type:

pd.DataFrame

print_all_services()

Print name, type, and URL of all services available through Esri server.

Return type:

None

Returns:

None

print_object_data(layer_obj)

Print the data of a Layer object.

Parameters:

layer_obj (Layer) – The Layer object to print.

Return type:

None

Returns:

None

Added in version 1.1.1.

select_layers_by_layers(layer_name)

Print a subset of the service table.

Parameters:

layer_name (str) – The name of the layer to print.

Returns:

A list of Layer objects matching the selected layer name.

Return type:

List[Any]

Added in version 1.1.0

select_layers_by_service(service_name)

Print and output a subset of the service table.

Parameters:

service_name (str) – The name of the service to print.

Returns:

A list of Layer objects for the selected service.

Return type:

List[Any]

Added in version 1.1.0

class Consensus.EsriServers.TFL(*args, **kwargs)

Bases: EsriConnector

This module contains the TFL class, which is a subclass of EsriConnector(). It is used to connect to the TfL Open Data Hub and retrieve data.

Usage:

from Consensus.EsriServers import TFL
import asyncio

async def print_all():
    tfl = TFL(max_retries=30, retry_delay=2)
    await tfl.initialise()  # initialise the connection
    tfl.print_all_services()  # a method to help you choose which service you'd like to download data for.

asyncio.run(print_all())

The above code will connect to the TfL Open Data Hub and print all available services. You select the service you want to connect to by copying the service name string that comes after “Service name:” in the output.

Let’s say you want to view the bus stops data and explore the metadata:

from Consensus.EsriServers import TFL
import asyncio

async def minimal():
    tfl = TFL(max_retries=30, retry_delay=2)
    await tfl.initialise()
    metadata = await tfl.metadata_as_pandas(included_services=['Bus_Stops'])
    print(metadata)

asyncio.run(minimal())

This will connect to the TfL Open Data Hub and retrieve the metadata for the Bus_Stops service. From here, you can create a where clause to download a filtered subset of the data:

from Consensus.EsriConnector import FeatureServer
from Consensus.EsriServers import TFL
from Consensus.utils import where_clause_maker
import asyncio

async def download_test_data():
    tfl = TFL(max_retries=30, retry_delay=2)
    await tfl.initialise()

    fs_service_table = tfl.service_table
    fs = FeatureServer()

    service_name = 'Bus_Stops'
    layers = tfl.select_layers_by_service(service_name=service_name)  # choose the first layer of the 'Bus_Stops' service
    layer_full_name = layers[0].full_name  # use the layer's ``full_name`` attribute to select it in ``fs.setup()`` and when creating the ``where_clause``


    column_name = 'STOP_NAME'
    geographic_areas = ['Hazel Mead']
    where_clause = where_clause_maker(values=geographic_areas, column=column_name)  # a helper function that creates the SQL where clause for Esri Servers

    await fs.setup(full_name=layer_full_name, service_table=fs_service_table, max_retries=30, retry_delay=2, chunk_size=50)
    output = await fs.download(where_clause=where_clause, return_geometry=True)
    print(output)

asyncio.run(download_test_data())

__init__(*args, **kwargs)

Initialise class.

Parameters:
  • *args – Arguments for the EsriConnector class.

  • **kwargs – Keyword arguments for the EsriConnector class.

Returns:

None

async _fetch_response(session)

Helper method to fetch the response from the Esri server.

Parameters:

session (aiohttp.ClientSession) – The aiohttp.ClientSession object.

Returns:

The JSON response from the Esri server.

Return type:

Dict

async _load_all_services()

Load services into a dictionary.

Return type:

None

Returns:

None

async _validate_response()

Validate access to the base URL asynchronously using aiohttp. When a response is received, call _load_all_services() to load services into a dictionary.

Return type:

None

Returns:

None

async build_lookup(parent_path=PosixPath('/home/runner/work/Consensus/Consensus/Consensus'), included_services=[], replace_old=True)

Build a lookup table from scratch and save it to a JSON file.

Parameters:
  • parent_path (Path) – Parent path to save the lookup file.

  • included_services (List[str]) – List of services to include in the lookup. Defaults to [], which is interpreted as ‘all’.

  • replace_old (bool) – Whether to replace the old lookup file. Defaults to True.

Returns:

The lookup table as a pandas DataFrame.

Return type:

pd.DataFrame

async field_matching_condition(field)

Condition for matchable fields. Service() uses this method to filter the fields that are added to the matchable_fields column, which SmartLinker() subsequently uses for matching data tables. Override this method if you want to change the condition for matchable fields. Each Esri ArcGIS server has its own naming rules, so the condition for a custom server is left to the user. If you are using a built-in server (e.g. TFL or Open Geography Portal), you don’t have to touch this method.

The current implementation for this class will return no columns other than the columns listed in matchable_fields_extension.

Parameters:

field (Dict[str, str]) – The field dictionary. This is the input coming from Service().

Returns:

True if the field ends with ‘CD’ or ‘NM’ and the last 4 characters before the end are numeric, or if the field is in the matchable_fields_extension list, False otherwise.

Return type:

bool

async get_layer_obj(service, session)

Fetch metadata for a service and add it to the service table.

Parameters:
  • service (Dict[str, str]) – Dictionary of services.

  • session (aiohttp.ClientSession) – The aiohttp.ClientSession object.

Return type:

None

Returns:

None

async initialise()

Run this method to initialise the class session.

Return type:

None

Returns:

None

async metadata_as_pandas(included_services=[])

Asynchronously create a Pandas DataFrame of selected tables’ metadata.

Parameters:

included_services (List[str]) – A list of service names to include in the DataFrame. If empty, all services are included.

Returns:

A DataFrame containing the metadata of the selected services.

Return type:

pd.DataFrame

print_all_services()

Print name, type, and URL of all services available through Esri server.

Return type:

None

Returns:

None

print_object_data(layer_obj)

Print the data of a Layer object.

Parameters:

layer_obj (Layer) – The Layer object to print.

Return type:

None

Returns:

None

Added in version 1.1.1.

select_layers_by_layers(layer_name)

Print a subset of the service table.

Parameters:

layer_name (str) – The name of the layer to print.

Returns:

A list of Layer objects matching the selected layer name.

Return type:

List[Any]

Added in version 1.1.0

select_layers_by_service(service_name)

Print and output a subset of the service table.

Parameters:

service_name (str) – The name of the service to print.

Returns:

A list of Layer objects for the selected service.

Return type:

List[Any]

Added in version 1.1.0