16.4. Connecting and Editing Data Across Layers
The ability to connect data from different layers is one of the core tasks of GIS software. Such a connection can be based on the spatial relationship between the features, or on their shared attributes. QGIS provides tools to handle any of these associations, such as:
Processing algorithms that can create a new layer as a result of the connection, namely Join attributes by location, Join attributes by nearest, Join attributes by field value, …
SQL queries to create a new layer from the DB Manager or as a virtual layer
Joins properties or relations settings that temporarily extend attributes of features in a given layer, with those of features in another layer based on some matching attribute(s).
Joins and relations are technical concepts borrowed from databases to get the most out of data stored in tables by combining their contents. The idea is that features (rows) of different layers (tables) can be associated with each other. A given row can match any number of rows in the other table (zero, one or many).
16.4.1. Joining features between two layers
Joins in QGIS allow you to associate features of the current layer with features from another loaded vector layer. It does not matter whether the layers are spatially enabled, nor what type of geometry they have. The join is based on an attribute that is shared by the layers, in a one-to-one relationship.
To create a join on a layer (identified below as the target layer):
Click the Add new join button. The Add vector join dialog appears.
Select the Join layer you want to connect with the target vector layer
Specify the Join field (from the join layer) and the Target field (from the target layer). These are the fields that are used to find matching features in both layers, hence they should have values in common.
Press OK and a summary of selected parameters is added to the Join panel.
The steps above will create a join, where ALL the attributes of the first matching feature in the join layer are added to the target layer’s feature. The following logic is used to pair features during a join process:
All the features in the target layer are returned, regardless of whether they have a match.
If the target field contains duplicate values, these features are assigned the same feature from the join layer.
If the join field contains duplicate matching values, only the first fetched feature is picked.
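The pairing rules above can be sketched in plain Python. This is only an illustration of the described logic, not QGIS code: the function name `join_attributes`, the `join_` prefix and the sample rows are all hypothetical.

```python
def join_attributes(target_rows, join_rows, target_field, join_field, prefix="join_"):
    """Pair each target row with the first join row whose key matches."""
    # Keep only the first fetched join row per key value.
    first_match = {}
    for row in join_rows:
        first_match.setdefault(row[join_field], row)

    # Collect the joined field names so unmatched rows still carry them (as None).
    joined_fields = []
    for row in join_rows:
        for name in row:
            if name != join_field and name not in joined_fields:
                joined_fields.append(name)

    result = []
    for row in target_rows:  # every target row is returned, matched or not
        match = first_match.get(row[target_field])
        merged = dict(row)
        for name in joined_fields:
            merged[prefix + name] = match.get(name) if match else None
        result.append(merged)
    return result


# Hypothetical sample data
regions = [
    {"id": 1, "name": "North"},
    {"id": 2, "name": "South"},
    {"id": 2, "name": "South (duplicate key)"},
    {"id": 3, "name": "East"},
]
stats = [
    {"region_id": 1, "population": 100},
    {"region_id": 2, "population": 200},
    {"region_id": 2, "population": 999},  # ignored: not the first match
]

joined = join_attributes(regions, stats, "id", "region_id")
for row in joined:
    print(row)
```

Both target rows with `id` 2 receive the same join row, and the row with `id` 3 is kept with an empty joined value.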
Note
Joins in QGIS are based on a single field matching, so most of the time you will want to make sure that values in the matchable fields are unique.
QGIS provides some more options to tweak the join:
Cache join layer in virtual memory: allows you to cache values in memory (without geometries) from the joined layer in order to speed up lookups.
Create attribute index on the join field to speed up lookups
Dynamic form: helps to synchronize join fields on the fly, according to the Target field. This way, constraints for join fields are also correctly updated. Note that it’s deactivated by default because it may be very time consuming if you have a lot of features or a myriad of joins.
If the target layer is editable, then some icons will be displayed in the attribute table next to fields, in order to inform about their status:
The join layer is not configured to be editable. If you want to be able to edit join features from the target attribute table, then you have to check the option Editable join layer.
The join layer is well configured to be editable, but its current status is read only.
The join layer is editable, but synchronization mechanisms are not activated. If you want to automatically add a feature in the join layer when a feature is created in the target layer, then you have to check the option Upsert on edit. Symmetrically, the option Delete cascade may be activated if you want to automatically delete join features.
Joined fields: instead of adding all the fields from the joined layer, you can specify a subset.
Custom field name prefix for joined fields, in order to avoid name collision
16.4.2. Setting relations between multiple layers
Unlike joins that define a one-to-one link between features across two layers, relations help you build interconnections between multiple features across two or more layers. As such, relations are project level settings and are set in the Relations tab. From there, you can:
Add relation whose type can be:
polymorphic relation that you can add or edit with the dedicated tools in the action drop-down menu.
Note
There is no simple way yet to edit a non-polymorphic relation once it has been created. Only the name can be edited with a double-click. For any other parameters of such a relation you will have to remove and recreate it.
Discover relations: QGIS is able to discover existing relations from supported database formats (PostgreSQL, GeoPackage, ESRI File Geodatabase, …). This can be a good way to ease the relations definition.
16.4.2.1. One to many (1-N) relations
As an example, you have a polygon layer with all regions of Alaska, which provides attributes for the name and type of each region and a unique id (which acts as primary key).
Then you get another point layer or table with information about airports that are located in the regions and you also want to keep track of these. If you want to add them to the regions layer, you need to create a one to many relation using foreign keys, because there are several airports in most regions.
Layers and keys
QGIS makes no difference between a table and a vector layer.
Basically, a vector layer is a table with a geometry.
So you can add your table as a vector layer.
To demonstrate the 1-N relation, you can load the regions and airports layers in the sample dataset.
In practice, each airport belongs to exactly one region while each region can have any number of airports (a typical one to many relation).
In addition to the attributes describing the airports, the airports layer has another field fk_region which acts as a foreign key to the regions layer (if you have a database, you will probably want to define a constraint on it).
This fk_region field will always contain an id of a region.
It can be seen as a pointer to the region it belongs to.
All you have to do is to tell QGIS the relation between the layers so that you can design a custom edit form for editing and QGIS takes care of the setup. It works with different providers (so you can also use it with shape and csv files).
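The parent/child link described above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the tables are simplified, geometry-free stand-ins for the regions and airports layers, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE regions (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE airports (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        fk_region INTEGER REFERENCES regions (id)  -- pointer to the parent region
    );
    INSERT INTO regions (id, name) VALUES (1, 'Region A'), (2, 'Region B');
    INSERT INTO airports (name, fk_region) VALUES
        ('Airport 1', 1), ('Airport 2', 1), ('Airport 3', 2);
""")

# All children (airports) of one parent (region 1)
children = conn.execute(
    "SELECT name FROM airports WHERE fk_region = ? ORDER BY name", (1,)
).fetchall()
print(children)

# The constraint rejects a pointer to a region that does not exist
try:
    conn.execute("INSERT INTO airports (name, fk_region) VALUES ('Ghost', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The constraint is what guarantees that fk_region always contains the id of an existing region; providers without constraints (such as shape or csv files) rely on the data being consistent.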
Defining 1-N relations
The first thing we are going to do is to let QGIS know about the relations between the layers. To do so, open the Relations tab and click on Add Relation.
Name is going to be used as a title. It should be a human readable string describing what the relation is used for. We will just call it airport_relation in this case.
Referenced Layer (Parent), also considered as the parent layer, is the one with the primary key, pointed to, so here it is the regions layer. You need to define the primary key of the referenced layer, so it is ID.
Referencing Layer (Child), also considered as the child layer, is the one with the foreign key field on it. In our case, this is the airports layer. For this layer you need to add a referencing field which points to the other layer, so this is fk_region.
Note
Sometimes, you need more than a single field to uniquely identify features in a layer. Creating a relation with such a layer requires a composite key, i.e. more than a single pair of matching fields. Use the Add new field pair as part of a composite foreign key button to add as many pairs as necessary.
Id will be used for internal purposes and has to be unique. You may need it to build custom forms. If you leave it empty, one will be generated for you, but you can assign one yourself to get one that is easier to handle.
Relationship strength sets the strength of the relation between the parent and the child layer. The default Association type means that the parent layer is simply linked to the child one. The Composition type also duplicates the child features when the parent ones are duplicated, and deletes the children when a parent feature is deleted, cascading over all levels (children of children of… are deleted as well).
From the Relations tab, you can also press the Discover Relation button to fetch the relations available from the providers of the loaded layers. This is possible for layers stored in data providers like PostgreSQL or SpatiaLite.
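The difference between the two relationship strengths on deletion can be modelled with a short Python sketch. It is a conceptual model only, not the way QGIS implements it; the `runways` layer, the `fk_airport` field and all helper names are hypothetical and only serve to show a second level of children.

```python
def delete_feature(layer, fid, tables, relations):
    """Delete one feature and, through Composition relations, its descendants."""
    tables[layer].pop(fid, None)
    for rel in relations:
        if rel["parent"] != layer or rel["strength"] != "Composition":
            continue  # Association: linked children are left untouched
        children = tables[rel["child"]]
        linked = [cid for cid, row in children.items() if row.get(rel["fk"]) == fid]
        for cid in linked:
            delete_feature(rel["child"], cid, tables, relations)


def sample_tables():
    return {
        "regions": {1: {"name": "Region A"}, 2: {"name": "Region B"}},
        "airports": {10: {"fk_region": 1}, 11: {"fk_region": 2}},
        "runways": {100: {"fk_airport": 10}, 101: {"fk_airport": 11}},  # hypothetical layer
    }


def relations_with(strength):
    return [
        {"parent": "regions", "child": "airports", "fk": "fk_region", "strength": strength},
        {"parent": "airports", "child": "runways", "fk": "fk_airport", "strength": strength},
    ]


composed = sample_tables()
delete_feature("regions", 1, composed, relations_with("Composition"))

associated = sample_tables()
delete_feature("regions", 1, associated, relations_with("Association"))

print(sorted(composed["airports"]), sorted(composed["runways"]))
print(sorted(associated["airports"]), sorted(associated["runways"]))
```

With Composition, removing region 1 also removes its airport and that airport's runway; with Association, only the region itself is removed.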
Forms for 1-N relations
Now that QGIS knows about the relation, it will be used to improve the forms it generates. As we did not change the default form method (autogenerated), it will just add a new widget in our form. So let’s select the layer region in the legend and use the identify tool. Depending on your settings, the form might open directly or you will have to choose to open it in the identification dialog under actions.
As you can see, the airports assigned to this particular region are all shown in a table. There are also some buttons available. Let’s review them briefly:
The button is for toggling the edit mode. Be aware that it toggles the edit mode of the airport layer, although we are in the feature form of a feature from the region layer. But the table is representing features of the airport layer.
The button is for saving all the edits in the child layer (airport).
The button lets you digitize the airport geometry in the map canvas and assigns the new feature to the current region by default. Note that the icon will change according to the geometry type.
The button adds a new record to the airport layer attribute table and assigns the new feature to the current region by default. The geometry can be drawn later with the Add part digitizing tool.
The button allows you to copy and paste one or more child features within the child layer. They can later be assigned to a different parent feature or have their attributes modified.
The symbol opens a new dialog where you can select any existing airport which will then be assigned to the current region. This may be handy if you created the airport on the wrong region by accident.
The symbol unlinks the selected airport(s) from the current region, effectively leaving them unassigned (the foreign key is set to NULL).
With the button you can zoom the map to the selected child features.
The two buttons to the right switch between the table view and form view of the related child features.
If you use the Drag and Drop Designer for the regions feature, you can select which tools are available. You can even decide whether to open a new form when a new feature is added, using the Force hide form on add feature option. Be aware that this option implies that not null attributes must take a valid default value to work correctly.
In the above example the referencing layer has geometries (so it isn’t just an alphanumeric table) so the above steps will create an entry in the layer attribute table that has no corresponding geometric feature. To add the geometry:
Select the record that has been added previously within the feature form of the referenced layer.
Use the Add Part digitizing tool to attach a geometry to the selected attributes table record.
If you work on the airport table, the Relation Reference widget is automatically set up for the fk_region field (the one used to create the relation), see Relation Reference widget.
In the airport form you will see the button at the right side of the fk_region field: if you click on the button, the form of the region layer will be opened.
This widget allows you to easily and quickly open the forms of the linked parent features.
The Relation Reference widget also has an option to embed the form of the parent layer within the child one.
To use it, select the fk_region field and check the Show embedded form option.
If you look at the feature dialog now, you will see that the form of the region is embedded inside the airports form and will even have a combobox, which allows you to assign the current airport to another region.
Moreover, if you toggle the editing mode of the airport layer, the fk_region field also has an autocompleter function: while typing you will see all the values of the id field of the region layer.
Here it is possible to digitize a polygon for the region layer using the button, if you enabled the Allow adding new features option for the airport layer.
The child layer can also be used in the Select Features By Value tool in order to select features of the parent layer based on attributes of their children.
In Fig. 16.111, all the regions where the mean altitude of the airports is greater than 500 meters above sea level are selected.
You will find that many different aggregation functions are available in the form.
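The kind of selection described above (parents chosen by an aggregate over their children) can be sketched in plain Python. The `elevation` field name and the sample values are assumptions made for illustration; check the actual attribute names in your own airports layer.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: parent regions and their child airports
regions = {1: "Region A", 2: "Region B", 3: "Region C", 4: "Region D"}
airports = [
    {"name": "Airport 1", "fk_region": 1, "elevation": 600},
    {"name": "Airport 2", "fk_region": 1, "elevation": 700},
    {"name": "Airport 3", "fk_region": 2, "elevation": 100},
    {"name": "Airport 4", "fk_region": 2, "elevation": 800},
    {"name": "Airport 5", "fk_region": 4, "elevation": 900},
]

# Group the child values by the foreign key pointing to the parent
elevations_by_region = defaultdict(list)
for airport in airports:
    elevations_by_region[airport["fk_region"]].append(airport["elevation"])

# Keep the regions whose airports have a mean elevation above 500 m
selected = sorted(
    regions[region_id]
    for region_id, elevations in elevations_by_region.items()
    if mean(elevations) > 500
)
print(selected)
```

Region C has no airports, so it has no mean to compare and is never selected; swapping `mean` for `max`, `min` or `len` gives other aggregates over the same children.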
16.4.2.2. Many-to-many (N-M) relations
N-M relations are many-to-many relations between two tables.
Take, for instance, the airports and airlines layers: an airport receives several airline companies and an airline company flies to several airports.
This SQL code creates the three tables we need for an N-M relationship in a PostgreSQL/PostGIS schema named locations.
You can run the code using pgAdmin.
The airports table stores the airports layer and the airlines table stores the airlines layer.
Both tables use only a few fields for clarity.
The tricky part is the airports_airlines table.
We need it to list all airlines for all airports (or vice versa).
This kind of table is known as a pivot table.
The constraints in this table ensure that an airport can be associated with an airline only if both already exist in their layers.
CREATE SCHEMA locations;

CREATE TABLE locations.airports
(
    id serial NOT NULL,
    geom geometry(Point, 4326) NOT NULL,
    airport_name text NOT NULL,
    CONSTRAINT airports_pkey PRIMARY KEY (id)
);

CREATE INDEX airports_geom_idx ON locations.airports USING gist (geom);

CREATE TABLE locations.airlines
(
    id serial NOT NULL,
    geom geometry(Point, 4326) NOT NULL,
    airline_name text NOT NULL,
    CONSTRAINT airlines_pkey PRIMARY KEY (id)
);

CREATE INDEX airlines_geom_idx ON locations.airlines USING gist (geom);

CREATE TABLE locations.airports_airlines
(
    id serial NOT NULL,
    airport_fk integer NOT NULL,
    airline_fk integer NOT NULL,
    CONSTRAINT airports_airlines_pkey PRIMARY KEY (id),
    CONSTRAINT airports_airlines_airport_fk_fkey FOREIGN KEY (airport_fk)
        REFERENCES locations.airports (id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
        DEFERRABLE INITIALLY DEFERRED,
    CONSTRAINT airports_airlines_airline_fk_fkey FOREIGN KEY (airline_fk)
        REFERENCES locations.airlines (id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
        DEFERRABLE INITIALLY DEFERRED
);
Instead of PostgreSQL you can also use GeoPackage. In this case, the three tables can be created manually using the DB Manager. In GeoPackage there are no schemas so the locations prefix is not needed.
Foreign key constraints in the airports_airlines table can’t be created using the table creation or table editing tools, so they should be created using SQL.
GeoPackage doesn’t support ADD CONSTRAINT statements so the airports_airlines table should be created in two steps:
Set up the table only with the id field using the table creation tool.
Using the SQL window, type and execute this SQL code:
ALTER TABLE airports_airlines
    ADD COLUMN airport_fk INTEGER
    REFERENCES airports (id)
    ON DELETE CASCADE
    ON UPDATE CASCADE
    DEFERRABLE INITIALLY DEFERRED;

ALTER TABLE airports_airlines
    ADD COLUMN airline_fk INTEGER
    REFERENCES airlines (id)
    ON DELETE CASCADE
    ON UPDATE CASCADE
    DEFERRABLE INITIALLY DEFERRED;
Then in QGIS, you should set up two one-to-many relations as explained above:
a relation between the airlines table and the pivot table;
and a second one between the airports table and the pivot table.
An easier way to do it (only for PostgreSQL) is using the Discover Relations button in the Relations tab. QGIS will automatically read all relations in your database and you only have to select the two you need. Remember to load the three tables in the QGIS project first.
If you remove an airport or an airline, QGIS won’t remove the associated record(s) in the airports_airlines table.
This task will be performed by the database if we specify the right constraints when creating the pivot table, as in the current example.
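The cascade behaviour of the pivot table constraints can be checked with Python's built-in sqlite3 module (GeoPackage files are SQLite databases). This is a sketch under simplifying assumptions: geometry columns are left out, the sample rows are invented, and the pivot table is created in a single statement rather than with the two-step GeoPackage procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply the constraints
conn.executescript("""
    CREATE TABLE airports (id INTEGER PRIMARY KEY, airport_name TEXT NOT NULL);
    CREATE TABLE airlines (id INTEGER PRIMARY KEY, airline_name TEXT NOT NULL);
    CREATE TABLE airports_airlines (
        id INTEGER PRIMARY KEY,
        airport_fk INTEGER NOT NULL REFERENCES airports (id)
            ON DELETE CASCADE ON UPDATE CASCADE DEFERRABLE INITIALLY DEFERRED,
        airline_fk INTEGER NOT NULL REFERENCES airlines (id)
            ON DELETE CASCADE ON UPDATE CASCADE DEFERRABLE INITIALLY DEFERRED
    );
    INSERT INTO airports (id, airport_name) VALUES (1, 'Airport 1'), (2, 'Airport 2');
    INSERT INTO airlines (id, airline_name) VALUES (1, 'Airline X'), (2, 'Airline Y');
    INSERT INTO airports_airlines (airport_fk, airline_fk) VALUES (1, 1), (1, 2), (2, 1);
""")

# Removing an airport: the database itself removes the matching pivot rows
conn.execute("DELETE FROM airports WHERE id = 1")
conn.commit()

remaining = conn.execute(
    "SELECT airport_fk, airline_fk FROM airports_airlines ORDER BY id"
).fetchall()
print(remaining)
```

Only the link between airport 2 and airline 1 survives; the two airlines themselves are untouched.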
Note
Combining N-M relation with automatic transaction group
You should enable the transaction mode when working in such a context. QGIS should be able to add or update row(s) in all tables (airlines, airports and the pivot table).
Finally we have to select the right cardinality in the airports and airlines layers.
For the first one we should choose the airlines (id) option and for the second one the airports (id) option.
Now you can associate an airport with an airline (or an airline with an airport) using Add child feature or Link existing child feature in the subforms.
A record will automatically be inserted in the airports_airlines table.
Note
Using Many to one relation cardinality
Sometimes hiding the pivot table in an N-M relationship is not desirable, mainly because there are attributes in the relationship that can only have values when a relationship is established. If your tables have a geometry field, it could be interesting to activate the On map identification option for the foreign key fields in the pivot table.
Note
Pivot table primary key
Avoid using multiple fields in the primary key in a pivot table.
QGIS assumes a single primary key so a constraint like
constraint airports_airlines_pkey primary key (airport_fk, airline_fk)
will not work.
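If duplicate airport/airline pairs still need to be prevented, one possible approach (an assumption on our side, not something the text above prescribes) is to keep the single-field primary key and add a separate UNIQUE constraint on the two foreign key fields. A minimal sqlite3 sketch of that idea:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE airports_airlines (
        id INTEGER PRIMARY KEY,             -- single-field primary key
        airport_fk INTEGER NOT NULL,
        airline_fk INTEGER NOT NULL,
        UNIQUE (airport_fk, airline_fk)     -- each pair can be stored only once
    )
""")
conn.execute("INSERT INTO airports_airlines (airport_fk, airline_fk) VALUES (1, 1)")

try:
    conn.execute("INSERT INTO airports_airlines (airport_fk, airline_fk) VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Foreign key constraints are omitted here to keep the sketch focused on the key layout.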
16.4.2.3. Polymorphic relations
The purpose
Polymorphic relations are a special case of 1-N relations, where a single referencing (document) layer contains the features for multiple referenced layers.
This differs from normal relations, which require a different referencing layer for each referenced layer.
A single referencing (document) layer is achieved by adding an additional layer_field column in the referencing (document) layer that stores information to identify the referenced layer.
In its simplest form, the referencing (document) layer will just insert the layer name of the referenced layer into this field.
To be more precise, a polymorphic relation is a set of normal relations having the same referencing layer but having the referenced layer dynamically defined. The polymorphic setting of the layer is solved by using an expression which has to match some properties of the referenced layer like the table name, layer id, layer name.
Imagine we are going to the park and want to take pictures of different species of plants and animals we see there.
Each plant or animal has multiple pictures associated with it, so if we use the normal 1:N relations to store pictures, we would need two separate tables, animal_images and plant_images.
This might not be a problem for 2 tables, but imagine if we want to take separate pictures for mushrooms, birds etc.
Polymorphic relations solve this problem as all the referencing features are stored in the same table documents.
For each feature the referenced layer is stored in the referenced_layer field and the referenced feature id in the referenced_fk field.
Defining polymorphic relations
First, let QGIS know about the polymorphic relations between the layers. To do so, open the Relations tab and click on the little down arrow next to the Add Relation button, so you can select the Add Polymorphic Relation option from the dropdown that appears.
Id will be used for internal purposes and has to be unique. You may need it to build custom forms. If you leave it empty, one will be generated for you, but you can assign one yourself to get one that is easier to handle.
Referencing Layer (Child), also considered as the child layer, is the one with the foreign key field on it. In our case, this is the documents layer. For this layer you need to add a referencing field which points to the other layer, so this is referenced_fk.
Note
Sometimes, you need more than a single field to uniquely identify features in a layer. Creating a relation with such a layer requires a composite key, i.e. more than a single pair of matching fields. Use the Add new field pair as part of a composite foreign key button to add as many pairs as necessary.
Layer Field is the field in the referencing table that stores the result of the evaluated layer expression, which identifies the referenced table that this feature belongs to. In our example, this would be the referenced_layer field.
Layer expression evaluates to a unique identifier of the layer. This can be the layer name @layer_name, the layer id @layer_id, the layer’s table name decode_uri(@layer, 'table') or anything that can uniquely identify a layer.
Relationship strength sets the strength of the generated relations between the parent and the child layer. The default Association type means that the parent layer is simply linked to the child one. The Composition type also duplicates the child features when the parent ones are duplicated, and deletes the children when a parent feature is deleted, cascading over all levels (children of children of… are deleted as well).
Referenced Layers, also considered as parent layers, are those with the primary key, pointed to, so here they would be the plants and animals layers. You need to define the primary key of the referenced layers from the dropdown, so it is fid. Note that the definition of a valid primary key requires all the referenced layers to have a field with that name. If there is no such field you cannot save a polymorphic relation.
Once added, the polymorphic relation can be edited via the Edit Polymorphic Relation menu entry.
The example above uses the following database schema:
CREATE SCHEMA park;

CREATE TABLE park.animals
(
    fid serial NOT NULL,
    geom geometry(Point, 4326) NOT NULL,
    animal_species text NOT NULL,
    CONSTRAINT animals_pkey PRIMARY KEY (fid)
);

CREATE INDEX animals_geom_idx ON park.animals USING gist (geom);

CREATE TABLE park.plants
(
    fid serial NOT NULL,
    geom geometry(Point, 4326) NOT NULL,
    plant_species text NOT NULL,
    CONSTRAINT plants_pkey PRIMARY KEY (fid)
);

CREATE INDEX plants_geom_idx ON park.plants USING gist (geom);

CREATE TABLE park.documents
(
    fid serial NOT NULL,
    referenced_layer text NOT NULL,
    referenced_fk integer NOT NULL,
    image_filename text NOT NULL,
    CONSTRAINT documents_pkey PRIMARY KEY (fid)
);
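How the two fields of the documents table resolve to a parent feature can be sketched in plain Python. This is a conceptual illustration only: it assumes the layer expression evaluates to the table name (animals or plants), and the species, file names and helper functions are invented.

```python
# Hypothetical contents of the referenced (parent) layers, keyed by fid
layers = {
    "animals": {1: {"animal_species": "fox"}, 2: {"animal_species": "heron"}},
    "plants": {1: {"plant_species": "oak"}},
}

# Hypothetical contents of the single referencing (child) layer
documents = [
    {"fid": 1, "referenced_layer": "animals", "referenced_fk": 1, "image_filename": "fox_01.jpg"},
    {"fid": 2, "referenced_layer": "animals", "referenced_fk": 1, "image_filename": "fox_02.jpg"},
    {"fid": 3, "referenced_layer": "plants", "referenced_fk": 1, "image_filename": "oak_01.jpg"},
    {"fid": 4, "referenced_layer": "animals", "referenced_fk": 2, "image_filename": "heron_01.jpg"},
]


def documents_for(layer_name, fid):
    """Return the image file names attached to one parent feature."""
    return [
        doc["image_filename"]
        for doc in documents
        if doc["referenced_layer"] == layer_name and doc["referenced_fk"] == fid
    ]


def parent_of(document):
    """Resolve a document to its parent feature using both fields."""
    return layers[document["referenced_layer"]][document["referenced_fk"]]


fox_images = documents_for("animals", 1)
oak_images = documents_for("plants", 1)
print(fox_images, oak_images)
print(parent_of(documents[2]))
```

The animals and plants layers both contain a feature with fid 1, which is why referenced_fk alone is not enough and the referenced_layer field is needed to tell the parents apart.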