Consensus.EsriServers module
This module contains classes that connect to Esri Servers and download data from them. It is designed to be used in conjunction with the FeatureServer() class from the Consensus.EsriConnector module.
- class Consensus.EsriServers.OpenGeography(*args, **kwargs)
Bases:
EsriConnector
Open Geography Portal
This module provides a class, OpenGeography(), that connects to the Open Geography Portal API. Running OpenGeography().build_lookup() is necessary if you want to make use of the SmartLinker() class from the GeocodeMerger module.
Usage:
This class works in the same way as the EsriConnector() class, but it is specifically designed for the Open Geography Portal API. It relies on the build_lookup() method, which creates a lookup table for the portal’s FeatureServers and saves it to a JSON file.

    from Consensus.EsriServers import OpenGeography
    import asyncio

    async def build_ogp_lookup():
        ogp = OpenGeography()
        await ogp.initialise()
        await ogp.build_lookup()

    asyncio.run(build_ogp_lookup())

As with the TFL class, you can combine OpenGeography() with the FeatureServer() class to download data from the portal’s FeatureServers.

    from Consensus.EsriConnector import FeatureServer
    from Consensus.EsriServers import OpenGeography
    from Consensus.utils import where_clause_maker
    import asyncio

    async def download_test_data():
        og = OpenGeography(max_retries=30, retry_delay=2)
        await og.initialise()
        fs_service_table = og.service_table
        fs = FeatureServer()

        column_name = 'WD23NM'
        geographic_areas = ['Brockley']
        service_name = 'Wards_December_2023_Boundaries_UK_BSC'

        # choose the first layer of the 'Wards_December_2023_Boundaries_UK_BSC' service
        layers = og.select_layers_by_service(service_name=service_name)
        # use the layer's ``full_name`` attribute to select it in ``fs.setup()``
        layer_full_name = layers[0].full_name

        # a helper function that creates the SQL where clause for Esri Servers
        where_clause = where_clause_maker(values=geographic_areas, column=column_name)

        await fs.setup(full_name=layer_full_name, service_table=fs_service_table, max_retries=30, retry_delay=2, chunk_size=50)
        output = await fs.download(where_clause=where_clause, return_geometry=True)
        print(output)

    asyncio.run(download_test_data())

However, it is perhaps best to rely on the SmartLinker() class from the GeocodeMerger module for more complex operations.
- __init__(*args, **kwargs)
Initialise class.
- Parameters:
*args – Arguments for the EsriConnector class.
**kwargs – Keyword arguments for the EsriConnector class.
- Returns:
None
- async _fetch_response(session)
Helper method to fetch the response from the Esri server.
- Parameters:
session (aiohttp.ClientSession) – The aiohttp.ClientSession object.
- Returns:
The JSON response from the Esri server.
- Return type:
Dict
- async _load_all_services()
Load services into a dictionary.
- Return type:
None
- Returns:
None
- async _validate_response()
Validate access to the base URL asynchronously using aiohttp. When a response is received, call _load_all_services() to load services into a dictionary.
- Return type:
None
- Returns:
None
- async build_lookup(parent_path=PosixPath('/home/runner/work/Consensus/Consensus/Consensus'), included_services=[], replace_old=True)
Build a lookup table from scratch and save it to a JSON file.
- Parameters:
parent_path (Path) – Parent path to save the lookup file.
included_services (List[str]) – List of services to include in the lookup. Defaults to [], which is interpreted as ‘all’.
replace_old (bool) – Whether to replace the old lookup file. Defaults to True.
- Returns:
The lookup table as a pandas DataFrame.
- Return type:
pd.DataFrame
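The default parent_path shown in the signature appears to be a path resolved on the machine that built this documentation, so passing an explicit path may be the safer choice. The sketch below is not the library's code. It only illustrates, using the standard library, one plausible reading of how parent_path and replace_old could interact when a lookup is written to JSON. The function save_lookup, the file name lookup.json and the record layout are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def save_lookup(
    records: List[Dict[str, str]],
    parent_path: Path,
    replace_old: bool = True,
    file_name: str = "lookup.json",  # hypothetical; the library picks its own file name
) -> Path:
    """Write ``records`` as JSON under ``parent_path`` and return the file path."""
    parent_path.mkdir(parents=True, exist_ok=True)
    target = parent_path / file_name
    if target.exists() and not replace_old:
        # Assumed behaviour: keep the existing file untouched.
        return target
    target.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = save_lookup([{"service": "Bus_Stops"}], Path(tmp))
        print(path.name)
```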
- async field_matching_condition(field)
Condition for matchable fields. This method is used by Service() to filter the fields that are added to the matchable_fields columns, which are subsequently used by SmartLinker() for matching data tables. This method is meant to be overridden by the user if they want to change the condition for matchable fields. Each Esri ArcGIS server has its own rules, so this is left for the user to handle. If you are using a built-in server (e.g. TFL or Open Geography Portal), you don’t have to touch this method.
- Parameters:
field (Dict[str, str]) – The field dictionary. This is the input coming from Service().
- Returns:
True if the field ends with ‘CD’ or ‘NM’ and the last 4 characters before the end are numeric, or if the field is in the matchable_fields_extension list, False otherwise.
- Return type:
bool
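A rule of the kind described above can be sketched as a plain function. This is an illustration under stated assumptions, not the library's implementation: it assumes the field dictionary carries its column name under a 'name' key, and it assumes the numeric part is the two-digit year seen in column names such as WD23NM, since the description does not pin down exactly which characters are checked. The name is_matchable is hypothetical.

```python
from typing import Dict, List, Optional


def is_matchable(
    field: Dict[str, str],
    matchable_fields_extension: Optional[List[str]] = None,
) -> bool:
    """Return True for names like 'WD23CD' or 'WD23NM', or for listed extension names."""
    extension = matchable_fields_extension or []
    name = field.get("name", "")  # assumption: the column name lives under 'name'
    if name in extension:
        return True
    upper = name.upper()
    has_suffix = upper.endswith(("CD", "NM"))
    year_digits = upper[-4:-2]  # assumption: a two-digit year sits just before the suffix
    return has_suffix and len(year_digits) == 2 and year_digits.isdigit()


if __name__ == "__main__":
    print(is_matchable({"name": "WD23NM"}))
    print(is_matchable({"name": "OBJECTID"}))
```

A subclass that targets a different server would replace the suffix and digit checks with whatever naming convention that server uses.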
- async get_layer_obj(service, session)
Fetch metadata for a service and add it to the service table.
- Parameters:
service (Dict[str, str]) – Dictionary of services.
session (aiohttp.ClientSession) – The aiohttp.ClientSession object.
- Return type:
None
- Returns:
None
- async initialise()
Run this method to initialise the class session.
- Return type:
None
- Returns:
None
- async metadata_as_pandas(included_services=[])
Asynchronously create a Pandas DataFrame of selected tables’ metadata.
- Parameters:
included_services (List[str]) – A list of service names to include in the DataFrame. If empty, all services are included.
- Returns:
A DataFrame containing the metadata of the selected services.
- Return type:
pd.DataFrame
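The "empty list means all services" convention can be sketched as follows. This is a minimal standalone illustration, not the library's code: filter_services and the shape of the services dictionary are hypothetical. The sketch uses None as its default rather than a list literal, which avoids sharing one mutable default between calls.

```python
from typing import Dict, List, Optional


def filter_services(
    all_services: Dict[str, dict],
    included_services: Optional[List[str]] = None,
) -> Dict[str, dict]:
    """Return the named services, or every service when no names are given."""
    if not included_services:
        # An empty list (or None) is interpreted as 'all'.
        return dict(all_services)
    wanted = set(included_services)
    return {name: meta for name, meta in all_services.items() if name in wanted}


if __name__ == "__main__":
    services = {"Bus_Stops": {"type": "FeatureServer"}, "Bus_Routes": {"type": "FeatureServer"}}
    print(sorted(filter_services(services, ["Bus_Stops"])))
```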
- print_all_services()
Print name, type, and URL of all services available through Esri server.
- Return type:
None
- Returns:
None
- print_object_data(layer_obj)
Print the data of a Layer object.
- Parameters:
layer_obj (Layer) – The Layer object to print.
- Return type:
None
- Returns:
None
Added in version 1.1.1.
- select_layers_by_layers(layer_name)
Print a subset of the service table.
- Parameters:
layer_name (str) – The name of the layer to print.
- Returns:
A list of Layer objects matching the selected layer name.
- Return type:
List[Any]
Added in version 1.1.0
- select_layers_by_service(service_name)
Print and output a subset of the service table.
- Parameters:
service_name (str) – The name of the service to print.
- Returns:
A list of Layer objects for the selected service.
- Return type:
List[Any]
Added in version 1.1.0
- class Consensus.EsriServers.TFL(*args, **kwargs)
Bases:
EsriConnector
This module contains the TFL class, which is a subclass of EsriConnector(). It is used to connect to the TfL Open Data Hub and retrieve data.
Usage:

    from Consensus.EsriServers import TFL
    import asyncio

    async def print_all():
        tfl = TFL(max_retries=30, retry_delay=2)
        await tfl.initialise()  # initialise the connection
        tfl.print_all_services()  # a method to help you choose which service you'd like to download data for.

    asyncio.run(print_all())

The above code will connect to the TfL Open Data Hub and print all available services. You select the service you want to connect to by copying the service name string that comes after “Service name:” in the output.
Let’s say you want to view the bus stops data and explore the metadata:

    from Consensus.EsriServers import TFL
    import asyncio

    async def minimal():
        tfl = TFL(max_retries=30, retry_delay=2)
        await tfl.initialise()
        metadata = await tfl.metadata_as_pandas(included_services=['Bus_Stops'])
        print(metadata)

    asyncio.run(minimal())

This will connect to the TfL Open Data Hub and retrieve the metadata for the Bus_Stops service. From here, you can create a where clause to further fine-tune your query:

    from Consensus.EsriConnector import FeatureServer
    from Consensus.EsriServers import TFL
    from Consensus.utils import where_clause_maker
    import asyncio

    async def download_test_data():
        tfl = TFL(max_retries=30, retry_delay=2)
        await tfl.initialise()
        fs_service_table = tfl.service_table
        fs = FeatureServer()

        service_name = 'Bus_Stops'
        # choose the first layer of the 'Bus_Stops' service
        layers = tfl.select_layers_by_service(service_name=service_name)
        # use the layer's ``full_name`` attribute to select it in ``fs.setup()`` and when creating the ``where_clause``
        layer_full_name = layers[0].full_name

        column_name = 'STOP_NAME'
        geographic_areas = ['Hazel Mead']
        # a helper function that creates the SQL where clause for Esri Servers
        where_clause = where_clause_maker(values=geographic_areas, column=column_name)

        await fs.setup(full_name=layer_full_name, service_table=fs_service_table, max_retries=30, retry_delay=2, chunk_size=50)
        output = await fs.download(where_clause=where_clause, return_geometry=True)
        print(output)

    asyncio.run(download_test_data())

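To give a feel for what the where clause in the example above might look like, here is a minimal sketch of a helper that builds an SQL IN condition for an ArcGIS query. It is not the implementation of where_clause_maker, whose exact output may differ. The name make_where_clause is hypothetical, and the fallback to 1=1 for an empty list is an assumption based on the common ArcGIS idiom for "match every row".

```python
from typing import List


def make_where_clause(values: List[str], column: str) -> str:
    """Build ``column IN ('a', 'b')``, doubling single quotes inside values."""
    if not values:
        return "1=1"  # common ArcGIS idiom for selecting every row
    quoted = ", ".join("'" + value.replace("'", "''") + "'" for value in values)
    return f"{column} IN ({quoted})"


if __name__ == "__main__":
    print(make_where_clause(["Hazel Mead"], "STOP_NAME"))
```

Doubling embedded single quotes keeps values such as place names with apostrophes from breaking the clause.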
- __init__(*args, **kwargs)
Initialise class.
- Parameters:
*args – Arguments for the EsriConnector class.
**kwargs – Keyword arguments for the EsriConnector class.
- Returns:
None
- async _fetch_response(session)
Helper method to fetch the response from the Esri server.
- Parameters:
session (aiohttp.ClientSession) – The aiohttp.ClientSession object.
- Returns:
The JSON response from the Esri server.
- Return type:
Dict
- async _load_all_services()
Load services into a dictionary.
- Return type:
None
- Returns:
None
- async _validate_response()
Validate access to the base URL asynchronously using aiohttp. When a response is received, call _load_all_services() to load services into a dictionary.
- Return type:
None
- Returns:
None
- async build_lookup(parent_path=PosixPath('/home/runner/work/Consensus/Consensus/Consensus'), included_services=[], replace_old=True)
Build a lookup table from scratch and save it to a JSON file.
- Parameters:
parent_path (Path) – Parent path to save the lookup file.
included_services (List[str]) – List of services to include in the lookup. Defaults to [], which is interpreted as ‘all’.
replace_old (bool) – Whether to replace the old lookup file. Defaults to True.
- Returns:
The lookup table as a pandas DataFrame.
- Return type:
pd.DataFrame
- async field_matching_condition(field)
Condition for matchable fields. This method is used by Service() to filter the fields that are added to the matchable_fields columns, which are subsequently used by SmartLinker() for matching data tables. This method is meant to be overridden by the user if they want to change the condition for matchable fields. Each Esri ArcGIS server has its own rules, so this is left for the user to handle. If you are using a built-in server (e.g. TFL or Open Geography Portal), you don’t have to touch this method.
The current implementation for this class will return no columns other than the columns listed in matchable_fields_extension.
- Parameters:
field (Dict[str, str]) – The field dictionary. This is the input coming from Service().
- Returns:
True if the field ends with ‘CD’ or ‘NM’ and the last 4 characters before the end are numeric, or if the field is in the matchable_fields_extension list, False otherwise.
- Return type:
bool
- async get_layer_obj(service, session)
Fetch metadata for a service and add it to the service table.
- Parameters:
service (Dict[str, str]) – Dictionary of services.
session (aiohttp.ClientSession) – The aiohttp.ClientSession object.
- Return type:
None
- Returns:
None
- async initialise()
Run this method to initialise the class session.
- Return type:
None
- Returns:
None
- async metadata_as_pandas(included_services=[])
Asynchronously create a Pandas DataFrame of selected tables’ metadata.
- Parameters:
included_services (List[str]) – A list of service names to include in the DataFrame. If empty, all services are included.
- Returns:
A DataFrame containing the metadata of the selected services.
- Return type:
pd.DataFrame
- print_all_services()
Print name, type, and URL of all services available through Esri server.
- Return type:
None
- Returns:
None
- print_object_data(layer_obj)
Print the data of a Layer object.
- Parameters:
layer_obj (Layer) – The Layer object to print.
- Return type:
None
- Returns:
None
Added in version 1.1.1.
- select_layers_by_layers(layer_name)
Print a subset of the service table.
- Parameters:
layer_name (str) – The name of the layer to print.
- Returns:
A list of Layer objects matching the selected layer name.
- Return type:
List[Any]
Added in version 1.1.0
- select_layers_by_service(service_name)
Print and output a subset of the service table.
- Parameters:
service_name (str) – The name of the service to print.
- Returns:
A list of Layer objects for the selected service.
- Return type:
List[Any]
Added in version 1.1.0