SAS Dictionary Tables and Real World Example Use Cases
The DICTIONARY tables discussed in this article are a set of pre-defined tables available in your SAS session that provide extensive metadata (i.e. data about data) for your SAS environment. Using PROC SQL, you can easily query the DICTIONARY tables to find a wealth of information about your SAS session.
In this article, we will show you a number of different use cases for the DICTIONARY Tables. While there are a total of 32 different dictionary tables, this article will focus on DICTIONARIES, LIBNAMES, TABLES, COLUMNS, MACROS, OPTIONS, CATALOGS . While some information is provided about the sample SQL queries, some basic background knowledge of PROC SQL syntax is assumed.
Before we continue, make sure you have SAS Studio or SAS 9.4 installed. Don't have the software? Download SAS Studio now. It's free!
The examples used in this article are based on the DICTIONARY Tables. The DICTIONARY tables are valid through PROC SQL code only using the pre-defined library, DICTIONARY. Alternatively, views of these tables are available in the pre-defined library SASHELP. The SASHELP views are those members starting with the letter “V” such as VCOLUMN and others shown below:
This article will focus on querying the DICTIONARY tables using PROC SQL, but note that it is possible to conduct similar queries using SAS DATA step and SASHELP views.
[Don't have the software yet? Download SAS Studio here for free.]
1. Obtain a list of the DICTIONARY tables – DICTIONARY.DICTIONARIES
Before you start using the DICTIONARY tables it would of course be helpful to know what they contain. Fortunately, there is a DICTIONARY table that lists all the other DICTIONARY tables available.
The DICTIONARIES table within the DICTIONARY library provides a helpful list of all the tables, the columns they contain, and individual column attributes.
To view this dataset, you can use the following PROC SQL code:
proc sql; select * from dictionary.dictionaries ; quit;
Which will produce the output shown (partially) below:
To determine the structure of the DICTIONARIES table itself, you can use the PROC SQL DESCRIBE TABLE statement as shown here on the DICTIONARIES table within the DICTIONARY library:
After running the DESCRIBE TABLE statement, you can see the table structure for this table in your SAS log, shown in part below:
An alternative way to see the table structure and find more information about a given DICTIONARY table is to create your own copy of a selected table in the WORK library.
This can be done by simply adding the CREATE TABLE statement to the start of your PROC SQL query. To create a copy of the DICTIONARIES table in your WORK directory, you can use the following SAS code:
proc sql; create table dictionaries as select * from dictionary.dictionaries ; quit;
After running the above code, you should now have a copy of DICTIONARIES in your WORK library, as shown here:
Using these basic concepts as a starting point, you can now explore and start to take advantage of the many other DICTIONARY tables available. While the DICTIONARY tables can be used in many ways, this article will outline some of the more common uses, and provide simple examples for you to build upon.
2.Analyze Library Metadata
If you have many libraries available to you in your SAS session, the DICTIONARY.LIBNAMES table can be a useful resource. First, by creating a temporary table in WORK that contains all columns and rows from DICTIONARY.LIBNAMES, you can browse the contents of the LIBNAMES table using the following code:
proc sql; create table libraries as select * from dictionary.libnames ; quit;
After running the above code, you should now see the output dataset, WORK.LIBRARIES under the OUTPUT DATA tab of your SAS Studio session:
Depending on your environment and libraries available, you may find many of the system related fields useful. In most cases however, the path and libname fields would contain the most relevant information. The following examples provides some helpful use cases for querying the LIBRARIES table.
Example 1 – Create a table Listing All Libraries available in your SAS session
Using the LIBNAME column and a SELECT DISTINCT statement, you can easily create a new table, LIB_LIST, in WORK using the following code:
proc sql; create table lib_list as select distinct libname from dictionary.libnames ; quit;
This output produced may vary depending on your environment and if you have any other user defined libraries, but with a new SAS Studio session you should see the following 4 libraries:
Example 2 – Determine # of paths are associated with a library
In SAS, it is possible to have more than 1 path (directory) associated with a library. To see which libraries have more than one path associated with them, you can use the LIBNAME and PATH columns combined with a PROC SQL COUNT and GROUP BY statements. Note that you also need to include the DISTINCT qualifier in front of the PATH so that the number of paths is not double counted within each libname:
proc sql; select distinct libname,count(distinct path) label='Path Count' from dictionary.libnames group by libname ; quit;
As you can see in the output, SASHELP is the only library with more than 1 path, having a total of 4 paths associated with it:
Example 3 – List all the paths referenced by SASHELP
If you would like to see exactly which paths are referenced by the SASHELP library, you can use the following code:
libname, path from dictionary.tables
The output should now list only the paths from SASHELP, as shown here:
Are you totally new to SAS?
Take our Practical SAS Training Course for Beginners and learn how to code your first SAS program!
If you’d like to know more information about the tables currently found in your SAS session, you can take advantage of the DICTIONARY.TABLES dataset.
The TABLES dataset contains extensive metadata about the various tables in your SAS session such as the creation and modified dates, the number of observations, file size and many other attributes.
First, to get a sense of the information available to you in the TABLES dataset, you can create a copy of it in your WORK library using the following code:
proc sql; create table tables as select * from dictionary.tables ; quit;
You should now have a dataset called TABLES in WORK with 165 rows and 41 columns, as shown here:
While there is no limit to the number of different queries you can run on the TABLES dataset, the following examples may provide some helpful use cases when you are trying to programmatically find out more information about the the tables/datasets in your SAS environment.
Example 1 – Number of datasets in SASHELP library
To determine the number of datasets found within the SASHELP library, you can use the COUNT statement as shown below. The label option is added to make the output more descriptive and the WHERE statement filters on tables found only the the SASHELP library:
proc sql; select count(memname) label='# of datasets in SASHELP' from dictionary.tables where libname='SASHELP' ; quit;
In the output produced, you can now see the total number of datasets found in the SASHELP library:
Example 2 – Index Utilization in SASHELP library
To determine which datasets have an index associated with them, you can use the INDXTYPE column and the SQL code shown here:
proc sql; select indxtype,count(indxtype) as count from dictionary.tables where libname='SASHELP' group by indxtype ; quit;
In the output table, you can how many datasets have no index (the first blank row), BOTH types of indexes or either a simple or composite index:
Example 3 – List Large Datasets with greater than 10,000 observations and 10 variables
Using the MEMNAME (table name), NOBS (number of observations) and NVAR (number of variables) columns from DICTIONARY.TABLES, you can list out “large” datasets. Depending on your data you may classify large datasets differently, but in SASHELP those datasets with more than 10,000 observations and 10 variables or more could be considered the largest:
proc sql; select memname,nobs,nvar from dictionary.tables where libname='SASHELP' and nobs > 10000 and nvar > 10 ; quit;
Using the above query, 4 datasets meet our criteria for “large”, and the number of observations and variables for each are shown in the output:
4.Query and Summarize Column Level Information
To drill deeper into the metadata beyond the table level, you can use the DICTIONARY tables to find out all you would need to know about the columns (variables) found within the SAS datasets of your SAS session.
Again, let’s start by creating a temporary copy of the COLUMNS table to explore further using a simple CREATE TABLE and SELECT statement within PROC SQL:
proc sql; create table columns as select * from dictionary.columns ; quit;
Since the COLUMNS table list all variables found in all datasets from your SAS session, you will notice it has over 16,000 rows. This number may vary of course if you have other datasets available in your SAS session.
As you will see, the COLUMNS dataset has all the variable level attributes you would need for a vast array of queries. Here you can find information about the variable names, labels, formats, lengths, and source.
Using this information, there are a number of interesting use cases for examining your variables further.
Example 1 – Count and List Variables with Missing Labels
If you are trying to clean and prepare a SAS dataset, it may be helpful to find out how many variables in your environment are missing variable labels. With the following code, the COUNT statement is used with the MISSING function to determine how many variables have no label in the SASHELP library:
proc sql; select count(name) label='Vars with Missing label' from dictionary.columns where missing(label) and libname='SASHELP' ; quit;
As you will see from the output, there are quite a few variables (15,994) missing labels in the SASHELP library:
To pursue these variables further, you can modify the above query to create a temporary table, MISSING_LABELS which contains the dataset and variable names for those variables which are missing labels:
proc sql; create table missing_labels as select memname,name from dictionary.columns where missing(label) and libname='SASHELP' ; quit;
In the output shown (partially) below, you can now browse the list of variables and dataset from which they came from that have no variable labels:
Example 2 – Count Common Variables Across Datasets
When you have a large number of datasets in your environment, in may be helpful to know which datasets contain the same variable. Depending on your data, this may be helpful to determine how different datasets link together or to remove redundant variables that are found in more than one dataset.
Here, we can use the HAVING statement along with the GROUP BY to only select those variables that show up more than 2 times in any dataset in the SASHELP library:
proc sql; select name,count(name) from dictionary.columns group by name having count(name) > 2 ; quit;
One problem you may notice when you examine the output (shown partially below) is that some variables such as ID show up twice (once as “ID” and once as “id”):
This occurs because the query is case sensitive, so you need to use either the LOWCASE or UPCASE function to group these together, and then add DISTINCT to remove duplicate entries:
proc sql; select distinct upcase(name) label='Name',count(upcase(name)) label='Count' from dictionary.columns group by upcase(name) having count(upcase(name)) > 2 ; quit;
You will now see that 11 entries of “ID” have been combined with the 4 entries for “id” to give a total count of 15:
Example 3 – Locate Common Variables Across Datasets
Once you know that a variable occurs in more than one of your datasets, it may be helpful to know which datasets that variable is found in. To do this, we can query the COLUMNS table to only show us those records where Name (column name) is equal to “ID”. Again, the upcase function is used to adjust for the different cases:
proc sql; select libname,memname,upcase(name) label='Column Name' from dictionary.columns where upcase(name)='ID' ; quit;
In the output shown below, you can see the library and dataset name where each of the 15 occurrences of ID are found:
Need help studying for the new SAS Certified Specialist Exam?
In SAS macro programming, it is common to create and use a large number number of macros. The MACROS DICTIONARY table provides a convenient way to view and query the current values of both system generated and user created macro variables.
First, let’s take a look at the MACROS table by selecting all rows and columns:
proc sql; select * from dictionary.macros ; quit;
Out of the 4 columns found in the MACROS table, the scope (Macro Scope), name (Macro Variable Name) and value (Macro Variable Value) columns are likely of greatest interest.
To explore the MACROS table a bit more, we can create a simple macro to add a few new macro variables to our SAS session. By calling a SQL select statement within the macro %test, we can see the 3 new macro variables created within this simple macro:
proc sql; select scope,count(scope) label='Macro Variable Count' from dictionary.macros group by scope ; quit;
In the output shown below, you will see that there are 3 distinct values for SCOPE:
AUTOMATIC - the automatic macro variables created by SAS when your session starts
GLOBAL - either user defined or SAS created macro variables available to you anywhere in your SAS session
TEST – Local macro variables available during the execute of the %test macro you just created
Example 2 – Check Macro Variable Values
The MACROS table is also useful for listing the actual current value of macro variables. For example, the system generated date macro variables (SYSDATE, SYSDATE9 and SYSDAY) are often useful within other macros or SAS program. To see the current values of just these macro variables, you can use the following code:
proc sql; select name, value from dictionary.macros where name in ('SYSDATE','SYSDATE9','SYSDAY') ; quit;
Which produces the following output:
Note these values are dynamic so they will of course change depending on when you run this code.
6.Viewing Currently Defined SAS Options
As you are likely aware, the number of customizable options within a SAS session is extensive. Many options can be set either programmatically or using the point-and-click graphical user interface within SAS.
Keeping track of which options you have set can of course be challenging, but fortunately the OPTIONS DICTIONARY table provides a convenient way to list all the SAS options and their current settings.
Since the options list is long, it is helpful to first create a temporary table that lists all the options and their settings, using the following code:
proc sql; create table options as select * from dictionary.options ; quit;
You will see in the output dataset (shown partially below) that there are a total of 458 options listed.
In this table, the greatest utility will likely come from the optname (Option Name), optdesc (Option Description) and setting (Option Setting) columns. Let’s look at some use cases for these fields.
Example 1 – Check One or More Option Settings
If you want to check what the settings are for one or more options, you can use a simple SQL query and filter on the desired option name.
For example, to find out the order of libraries that SAS searches for formats (FMTSEARCH) and how SAS handles a situation where a variable format cannot be found (FMTERR), you can run a SQL query to check the settings for these two options:
proc sql; select optname,optdesc,setting from dictionary.options where optname in ('FMTSEARCH','FMTERR') ; quit;
Which will produce the following output:
Example 2 – Keyword Search: Find Macro Related Options
When you are working on a SAS macro, it is helpful to know what options are currently set that may affect the macro.
Using the SQL LIKE statement with the wild card operator (%) ,you can run a keyword search of the Option Description field to return a list of all options which have the word “macro” in the description:
proc sql; select optname,optdesc,setting from dictionary.options where optdesc like ('%macro%') ; quit;
As you will see in the resulting output (shown partially below), all the options, their descriptions, and current settings that contain the word “macro” in the description are listed:
7.Browsing SAS Catalog Entries
The CATALOG DICTIONARY table can also be used to browse the various objects within your SAS Catalog entries.
Even with a standard SAS session, there are nearly 10,000 objects found in the default SAS catalogs. To browse the entries and get a sense of the information available to you in the CATALOGS DICTIONARY table you can start by creating a temporary copy of CATALOGS in WORK using the following code:
proc sql; create table catalogs as select * from dictionary.catalogs ; quit;
Which will produce the following dataset, shown partially below:
If you are unfamiliar with the types of catalog entries available in your SAS catalogs, you can start by summarizing the different Object Types using the objtype column and the COUNT function in SQL:
proc sql; select objtype,count(objtype) label='Count of Object Types' from dictionary.catalogs group by objtype ; quit;
As you can see in the resulting output shown partially below, there are a wide range of Object Types listed within the catalogs:
Example 1 – List all Macros stored within the Macro catalog
To create a simple list of cataloged macros, you can use the Object Type column to filter the results and then list the libname (Library Name) and objname (Macro name) as shown here:
proc sql; select libname,objname from dictionary.catalogs where objtype = 'MACRO' ; quit;
Which will produce the following output, shown partially below: