
Unlock hidden insights; Build a stunning Grafana dashboard to monitor OPA like a bro (part 3)

Hi there “Process Automation” fans,

Welcome to a new installment of “Process Automation” tips.

Last week, we saw our first Grafana dashboard passing by, nicely embedded in an entity layout and reading the metrics of a JSON exporter. Great stuff, but there is more data available to explore via metrics.

It’s time to continue our Grafana journey by connecting the OPA database through a fascinating exporter! It was not the intention to do it with this exporter (see below), but new knowledge gives new insights, which brings you to new areas of expertise.


Let’s get right into it…

BUT…First some upfront insights on directions you do NOT need to go!

Last week we learned about the “JSON exporter”; interesting stuff, and it works great. This post will dive into other data (in the database), and I was wondering how to grab that information! First I thought about calling HTTP SOAP services via the “Blackbox exporter”, but with all the authentication over OTDS that would be a nightmare…We need to be closer to our target, which is our database! So, searching brought me to the “Postgres exporter”; super interesting, but it misses the point here: that exporter has a different use-case (monitoring, alerting, and performance tracking of the DB itself), whereas I want to visualize real application data (our entity-, BPM-, and lifecycle-instances!)

A moment of “AHA” brings me this question: Why don’t you simply use the Postgres data source, just like you got your hands dirty with the Prometheus data source!? WHAT! Yes, as straightforward as that! #OMG…Couldn’t you tell me this in the first place!

So, now you know…First think, then act; I also do it sometimes the other way around! 🙃

Another valid question would be: Do we still need the JSON exporter!? After all, you can find some of these details in the database…correct? Well, not exactly! I did some quick research: each service container has a socket URI with a port number, and the platform internally checks whether it’s up or down; probably with some kind of pingy-thing. Not for now…We stick with other interesting SQL statements from a Postgres perspective.

I think it’s about time to explore some things now…

WAIT…STOP!! My great friend ChatGPT comes to the rescue with another piece of advice! Querying the DB directly from Grafana adds load to the tables (with a scheduled query), does not build up a history in a time-series DB (Prometheus), AND the post after this one will implement alerting from a Prometheus point of view, which brings us closer to the metrics fire. So, what we need is a custom exporter for my specific “counter” queries; that sounds better and is more fun to do! OK, but a custom exporter still needs to run a query on a schedule, right? Yes, but those are pure counters, which are cheap performance-wise.

So, the target for this post is set: creating a custom Python exporter (with the help of ChatGPT) that generates metrics data out of our queries as input for Prometheus…


Interesting queries to play with

Before we start building, we first make sure to get some interesting SQL queries to play with. For this we’ll increase the log details on our Postgres instance with these commands:

sudo vi /var/lib/pgsql/data/postgresql.conf
#Enable this line:
log_statement = 'all'
#Restart Postgres:
systemctl restart postgresql
sudo tail -999f /var/lib/pgsql/data/log/postgresql-{day}.log

Once logging is ready, we can simply perform some interesting actions in the UI of OPA that eventually show up as SELECT statements in the DB log. Make sure you have some BPM instances available and open the overview of the PIM (the consolidated view), which generates a statement like this:

-- BPM instance overview of the PIM!
SELECT bpm_model.model_name AS PROCESS_NAME,
       SUBRESULT.status,
       bpm_model.model_space AS MODEL_SPACE,
       SUBRESULT.cnt,
       bpm_model_translation.translated_description AS DESCRIPTION,
       bpm_model_revision.description AS DEFAULT_DESCRIPTION
FROM   (SELECT process_instance.model_id,
               process_instance.status,
               Count(process_instance.status) AS CNT
        FROM   process_instance
        WHERE  ( process_instance.organization =
                 'o=opa_tips,cn=cordys,cn=defaultInst,o=25.1.com' )
        GROUP  BY process_instance.model_id,
                  process_instance.status) SUBRESULT
       INNER JOIN bpm_model
               ON ( bpm_model.model_id = SUBRESULT.model_id )
       INNER JOIN bpm_model_revision
               ON ( bpm_model_revision.revision_id =
                    bpm_model.current_revision )
       LEFT OUTER JOIN bpm_model_translation
                    ON ( bpm_model_revision.revision_id =
                         bpm_model_translation.model_revision_id
                         AND language = 'en-US' );

You can also open the “all instances” view and filter on the “ABORTED” state, which produces a statement like this in the log:

-- Count of ABORTED BPM instances
SELECT Count(process_instance.instance_id) AS "COUNT"
FROM   process_instance
WHERE  ( process_instance.organization =
         'o=opa_tips,cn=cordys,cn=defaultInst,o=25.1.com' )
       AND ( process_instance.status IN ( 'ABORTED' ) )

The same applies to entities with a lifecycle. Create instances at runtime, watch the CIM, and see this passing by:

-- Case instance overview of the CIM!
SELECT bpm_model.model_id AS MODEL_ID,
       bpm_model.model_name AS MODEL_NAME,
       bpm_model.current_revision AS CURRENT_REVISION,
       bpm_model.status AS "CASE_MODEL_STATUS",
       CASE_INSTANCE.status AS CASE_INSTANCE_STATUS,
       CASE_INSTANCE.count AS "COUNT"
FROM   bpm_model,
       (SELECT case_instance.case_model,
               case_instance.status,
               Count(case_instance.status) AS COUNT
        FROM   case_instance
        WHERE  ( case_instance.organization =
                 'o=opa_tips,cn=cordys,cn=defaultInst,o=25.1.com' )
        GROUP  BY case_instance.case_model,
                  case_instance.status) CASE_INSTANCE
WHERE  bpm_model.model_id = CASE_INSTANCE.case_model

This is what you get when you look further down into the CIM:

-- Case instances count per state
SELECT case_instance.status AS STATUS,
       Count(case_instance.status) AS "COUNT"
FROM   case_instance
WHERE  ( case_instance.organization =
         'o=opa_tips,cn=cordys,cn=defaultInst,o=25.1.com' )
GROUP  BY case_instance.status;

Or when you filter further down on the “ABORTED” state:

-- Count of ABORTED Case instances
SELECT Count(case_instance.status) AS "COUNT"
FROM   case_instance
WHERE  ( case_instance.organization =
         'o=opa_tips,cn=cordys,cn=defaultInst,o=25.1.com' )
       AND case_instance.status = 'ABORTED';

An easy one would be an entity with an “Is active” Boolean property that we can query like this:

-- Count of "active" case entity instances!
SELECT Count(id) AS "COUNT"
FROM   o2opa_tipsprj_genericcase
WHERE  case_is_active = true;

You can go on and on like this, and as long as you use the Count() function the impact on your database stays light-weight. Let’s continue and see how we can call these queries from a custom exporter.


A custom exporter

Don’t panic about the word “custom”, as it’s easy to do…especially with a little help from our AI friends. First we need to prepare our VM with some “Python” software and the relevant packages:

# Install Python and Pip (if not installed)
sudo dnf install -y python3 python3-pip
# Install PostgreSQL development package on RHEL
sudo dnf install -y postgresql-devel
# Install GCC and related build tools on RHEL
sudo dnf install -y gcc python3-devel
# Install Required Python Packages
sudo pip3 install prometheus_client psycopg2-binary
# Create a nice directory
sudo mkdir /opt/postgres_custom_exporter

Once all is fine, continue creating the script: sudo vi /opt/postgres_custom_exporter/postgres_custom_exporter.py
This is the content (with 2 queries) you can use (and extend yourself!):

from prometheus_client import start_http_server, Gauge
import psycopg2
import time

# Define metrics
active_cases = Gauge('app_active_cases_total', 'Total number of active cases in last 5 minutes')
aborted_bpms = Gauge('app_aborted_bpms_total', 'Total number of aborted bpms in last 5 minutes')

# DB connection info
conn_info = {
    'dbname': 'opa_db',
    'user': 'postgres',
    'password': 'admin',
    'host': 'localhost',
    'port': 5432
}

def update_metrics():
    try:
        conn = psycopg2.connect(**conn_info)
        cur = conn.cursor()

        # active_cases
        cur.execute("""
            SELECT Count(id) AS "COUNT"
            FROM o2opa_tipsprj_genericcase
            WHERE case_is_active = true;
        """)
        active_cases.set(cur.fetchone()[0])

        # aborted_bpms
        cur.execute("""
            SELECT Count(process_instance.instance_id) AS "COUNT"
            FROM process_instance
            WHERE (process_instance.organization = 'o=opa_tips,cn=cordys,cn=defaultInst,o=25.1.com')
              AND (process_instance.status IN ('ABORTED'))
        """)
        aborted_bpms.set(cur.fetchone()[0])

        cur.close()
        conn.close()
    except Exception as e:
        print(f"DB error: {e}")
        active_cases.set(0)
        aborted_bpms.set(0)

if __name__ == '__main__':
    start_http_server(7878)
    print("Exporter running on http://localhost:7878/metrics")
    while True:
        update_metrics()
        time.sleep(30)

You see me using hard-coded DB credentials; good for now, bad for production! Make sure you retrieve them from somewhere else, like environment variables. AI can generate examples for it…I leave that with you.
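
For a head start, a minimal sketch could look like this, assuming you export variables such as OPA_DB_PASSWORD before starting the exporter (the OPA_DB_* names are my own invention, so pick your own convention):

import os

# Sketch: read the DB connection info from environment variables
# (the OPA_DB_* names are assumptions; choose your own convention)
conn_info = {
    'dbname': os.environ.get('OPA_DB_NAME', 'opa_db'),
    'user': os.environ.get('OPA_DB_USER', 'postgres'),
    'password': os.environ['OPA_DB_PASSWORD'],  # fail fast when it is not set
    'host': os.environ.get('OPA_DB_HOST', 'localhost'),
    'port': int(os.environ.get('OPA_DB_PORT', '5432')),
}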

You see it’s no rocket science…I don’t have any Python skills and I still understand what happens! The next step is making the script executable and running it:

# Make it executable
sudo chmod +x /opt/postgres_custom_exporter/postgres_custom_exporter.py
# Run it
sudo python3 /opt/postgres_custom_exporter/postgres_custom_exporter.py

You can now access http://192.168.56.107:7878/metrics and check for the metric values app_active_cases_total and app_aborted_bpms_total!
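
From the command line, a quick curl shows the familiar Prometheus exposition format (the values below are just an illustration; yours will differ):

curl -s http://192.168.56.107:7878/metrics | grep app_
# HELP app_active_cases_total Total number of active cases in last 5 minutes
# TYPE app_active_cases_total gauge
app_active_cases_total 3.0
# HELP app_aborted_bpms_total Total number of aborted bpms in last 5 minutes
# TYPE app_aborted_bpms_total gauge
app_aborted_bpms_total 1.0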

With this in place, update your Prometheus scraper config in sudo vi /opt/prometheus/opa_config.yml with a new job custom_postgres_app_metrics and restart Prometheus.

global:
  scrape_interval: 15s
  external_labels:
    monitor: 'mon_opa'

scrape_configs:
  - job_name: 'custom_postgres_app_metrics'
    static_configs:
      - targets: ['localhost:7878']
  - job_name: 'job_prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'job_opa'
    scrape_interval: 5s
    metrics_path: '/probe'
    static_configs:
      - targets: ['http://192.168.56.107:8080/home/opa_tips/app/mp/health/ready']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.56.107:7979
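
After the restart, a quick sanity check in the Prometheus expression browser could look like this; a plain gauge lookup per metric is enough, and the max_over_time variant is only an illustrative extra:

app_active_cases_total
max_over_time(app_aborted_bpms_total[1h])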

As you can see, app_active_cases_total and app_aborted_bpms_total are now also available from a “PromQL” standpoint…AND we can use them in Grafana like we saw last week, with an end-result like this:

[Image: grafana_001]

Victory!! 😍

We can of course create a Unix service wrapper around our script, but I leave that task with you (example here)…Again, it’s no rocket science!
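
For reference, a minimal sketch of such a systemd service could look like this (the unit name, paths, and dependencies are my assumptions; adjust them to your setup):

sudo vi /etc/systemd/system/postgres_custom_exporter.service

[Unit]
Description=Custom Postgres application metrics exporter for OPA
After=network.target postgresql.service

[Service]
ExecStart=/usr/bin/python3 /opt/postgres_custom_exporter/postgres_custom_exporter.py
Restart=on-failure

[Install]
WantedBy=multi-user.target

# Enable and start it
sudo systemctl daemon-reload
sudo systemctl enable --now postgres_custom_exporter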


You see how easy things can be with a little ReThinking, AI brainstorming, and simply trying the result. It’s a “DONE” where we learned about creating our own metrics exporter in Python, scraping the data into Prometheus, and consuming it further in our Grafana dashboard. Have a great weekend; we continue our journey with alerting from Prometheus (not via Grafana!) next week.

Don’t forget to subscribe to get updates on the activities happening on this site. Have you noticed the quiz where you find out if you are also “The Process Automation guy”?