Building a Teradata Object Catalog

Recently I was asked to provide an inventory of a production Teradata system that included a list of all user created objects and as much information about the objects that I could gather included perm space usage, row counts, and object type. Two of the three could easily be gathered using the data dictionary tables in DBC. However, the row count requirement made me pause for a moment and weigh the impact this could have. The production system that I was working on had hundreds of databases and thousands of tables.

I knew writing an SQL script that would build the SELECT queries that would perform a COUNT(*) against every table and capture that in a flat file or table on the system was trivial. However, when you have a system with thousands or tens of thousands of objects it would take quite some time to run all of those statements even if many of the tables are relatively small. It just didn’t make good sense to take that approach.

Then I realized that nearly every table had some level of statistics collected against it and maintained at least semi-regularly depending on the DBA that was responsible for the database. The trouble is that Teradata’s HELP STATISTICS command only shows you the number of unique values for each set of columns containing statistics. But then I remembered that the view I used as part of my stats maintenance routine did have a column for the number of rows. The views that I use as the basis for my stats maintenance routine were written by Dieter Noeth and is available for the general public to use courtesy of the TeradataForum. For the sake of brevity and maintaining a single source for the code I am not going to post the SQL for the views here but you can find them here in the Attachment Library over at the TeradataForum. (It should be noted that there are two different sets of views that support both the 32 bit and 64 bit versions of Teradata. Make sure you use the correct set of views for your installation.)

In order to keep things organized there are three views used to gather the information instead of creating a bunch of derived tables. The view Stats_Data is responsible for collapsing the number of fields into a comma delimited list for the first 16 columns and then extracting the pertinent binary data for the statistics collected on a given set of columns. The second view Stats_Basics_[32/64]bit extrapolates the binary data from the Stats_Data view into individual binary fields that require further refining. The final view Stats_Details composes the binary data into its final numeric and temporal values from which additional reporting and analysis can be done.

With these three views in place the Object_Catalog view could easily be constructed by simply bringing together a view additional data dictionary tables and the Stats_Details view. The Object_Catalog view displays the following attributes about the objects in each database on the system:

  • Database Name – self-explanatory
  • Object Name – self-explanatory
  • Column Count – This is a count of the number of fields for the object in TVFields.
  • Table Rows – The number of rows on the table when the stats were last collected. NULL if stats are not collected.
  • Table Size – The current perm space being consumed by the table.
  • Table Missing Stats – Indicator field for if stats exist on the table. NULL for non-table objects.
  • Table Kind – A single character field categorizing the object type. (e.g. T, V, M, etc.)
  • Table Kind Description – A description of the object type (e.g. Table, View, Macro, etc.)

The SQL for the Object_Catalog view can be found below:

REPLACE VIEW SysDBA.Object_Catalog AS
SELECT DISTINCT
       T.DatabaseName
     , T.ObjectName
     , DT4.ColumnCount
     , CAST(DT2.NumRows AS BIGINT) AS TableRows
     , DT3.TableSize
     , CASE WHEN T.Tablekind = 'T'
             AND TableRows IS NULL
            THEN 'y'
            WHEN T.Tablekind = 'T'
             AND Tablerows > 0
            THEN 'n'
            ELSE NULL
       END AS TableMissingStats
     , T.TableKind
     , CASE WHEN T.TableKind = 'A' THEN 'Aggregate function'
            WHEN T.TableKind = 'B' THEN 'Combined aggregate and ordered analytical function'
            WHEN T.TableKind = 'D' THEN 'JAR'
            WHEN T.TableKind = 'E' THEN 'External Stored Procedure'
            WHEN T.TableKind = 'F' THEN 'Standard function'
            WHEN T.TableKind = 'G' THEN 'Trigger'
            WHEN T.TableKind = 'H' THEN 'Instance or Constructor Method'
            WHEN T.TableKind = 'I' THEN 'Join Index'
            WHEN T.TableKind = 'J' THEN 'Journal'
            WHEN T.TableKind = 'M' THEN 'Macro'
            WHEN T.TableKind = 'N' THEN 'Hash Index'
            WHEN T.TableKind = 'P' THEN 'SQL Procedure'
            WHEN T.TableKind = 'Q' THEN 'Queue Table'
            WHEN T.TableKind = 'R' THEN 'Table function'
            WHEN T.TableKind = 'S' THEN 'Ordered analytical function'
            WHEN T.TableKind = 'T' THEN 'Table'
            WHEN T.TableKind = 'U' THEN 'User-defined data type'
            WHEN T.TableKind = 'V' THEN 'View'
            WHEN T.TableKind = 'X' THEN 'Authorization'
            ELSE NULL
       END AS TableKindDescription
FROM DBC.TABLES T
 LEFT JOIN
    (SELECT DatabaseName
          , TableName
          , NumRows
     FROM SysDBA.Stats_Details
     QUALIFY CollectTimeStamp
           = MAX(CollectTimestamp)
             OVER(PARTITION BY DatabaseName, TableName)
         AND ROW_NUMBER()
             OVER(PARTITION BY DatabaseName, TableName
                  ORDER BY CollectTimestamp DESC)
           = 1
    )     DT2
        ON T.DatabaseName = DT2.DatabaseName
       AND T.TableName    = DT2.TableName
LEFT JOIN
    (SELECT DatabaseName
          , TableName
          , SUM(CurrentPerm) AS TableSize
     FROM DBC.TableSize
     GROUP BY 1,2
    ) DT3
       ON T.DatabaseName = DT3.DatabaseName
      AND T.TableName    = DT3.TableName
LEFT JOIN
    (SELECT D.DATABASENAME
          , T.TVMNAME        AS TableName
          , COUNT(C.FieldId) AS ColumnCount
     FROM DBC.TVM T
        , DBC.TVFields C
        , DBC.Dbase D
     WHERE T.TVMId = C.TableId
       AND T.DatabaseId = D.DatabaseId
     GROUP BY 1,2
    ) DT4
      ON T.DatabaseName = DT4.DatabaseName
     AND T.TableName    = DT4.TableName
;

8 responses

  1. Monis, I have revised the blog post itself to post a direct link to the current version. It should be noted that it could change later, but you should always be able to find it in the Attachment Library.

  2. Is there a way to collect tablwsize info based on all tables only for a specific user has crated?

  3. Hi Rob. I hae no words to describe my ecstacy when i saw ur
    website. I am a budding Teradata developer, and am working as a
    Database Analyst at a Indian company. Currently, i am struggling to
    develop a Field Catalog for a insurance organisation, who is our
    client. Now, what they have given me is a schema, which has some
    views( that are created from base tables which reside in
    theproduction db). Now they have asked me to create a catalog,
    which should aid analysts to make business models. This catalaog,
    according to them, should contain definitions of various variables
    from the schema. Now, i want to ask you, how shud i begin doing it.
    First of al, i want to write a macro, which can use dbc tables, to
    tell me what are the base tables for the views in that particular
    schema. can it be done?( i want to kindda automate it, since it
    will be used on other schemas as well, and it should be noted here,
    that each schema holds around 60 views.) Thanks. I hope yo will
    reply and help me clear my doubts. And please don’t mind if i keep
    coming up wid bother-ful questions(wow, if for nothing, i devised a
    new slang!) Bye and take care. Prashant

  4. Hello Prashant,

    If I understand correctly, you are trying to take a view definition and obtain the column definitions and the base table from which it was defined? The column definitions from the view definition are easy. You can find them in DBC.Columns under where the TableName = and DatabaseName = . However, making taking the next step to get back to the source table is almost impossible. You could do some experiments of comparing what DBC.Columns (or DBC.TVFields) shows for the view column and base table. From there you might be able to draw some conclusions as to what your being asked to deliver.

    Let me know if I misunderstood your question.

Comments are closed.