
Boost your debugging skills today; Unlock the secrets of jstack and HAR files

Hi there AppWorks fans,

Welcome to a new installment of AppWorks tips.

At a customer we hit the wall with a performance issue on our beloved AppWorks platform. The exact cause is always hard to tackle and is mostly a combination of different layers, like the platform itself, the OS, the TomEE webapp server, the database, the network, the solution, the traffic, and the concurrent users; anything is open for discussion. This also happens with lots of pressure from top management (as expected), and under pressure a lot of things get fluid. With this, all the experts join the meeting, AND that’s exactly where the finest tools and tricks expose their power to solve the problem. Tools out of the box, but also custom-made tools we didn’t know already existed! Interesting…

I will not dive into any details on the performance issue at the customer, but I do want to share some tricks that can help you as well. It’s also new stuff to me, it is pretty techie, but worth the share…


Let’s get right into it…

Before we dive into the tooling, we first need to introduce a performance hit on our platform. The first thing I can think of is a simple infinite loop (a condition that is always true) in a short-lived BPM. A ‘Rule’ triggers this BPM on initialization of a new entity, and runtime waits for its synchronous reply! 😁

Don’t do this stuff at your customer; I’m in a sandbox VM to try out these kinds of features!

So, boot your VM, open the corresponding URL for designing our project in the relevant workspace, and start with a first case entity. Use a simple case_name property and make it nice and shiny (let me know in the comments if you need more guidance…). This is my starting point:

jstack_001

Next, we create a new ‘Rule’ building block of type ‘event’ with the fancy name e_oninit_start_bpm (or e_oi_start_bpm). You can do a first implementation like this:

jstack_002

An interesting naming convention can look like this (leave a comment below…):

  • onInit = e_oi_
  • onCreate = e_oc_
  • onPropChange = e_opc_
  • onRelChange = e_orc_
  • onDelete = e_od_
  • onShare = e_os_
  • onUnShare = e_ous_

The BPM bpm_oninit_case (saved in the bpms folder of the project) for that rule will get a first basic implementation like this:

jstack_003

To extend the BPM, we start with a basic change at process level: making it a short-lived process (this way we get direct feedback in runtime):

jstack_004

Have a look at the changed ‘Monitoring’ settings, which we (at our project) always reset to this default setting (again, leave a comment…):

jstack_005

Next, we implement the decision split with a solid “exclusive” type of choice to go left/right:

jstack_006

That’s it…The horror BPM is ready (incl. the rest of the project) for publication into runtime!


HTTP Archive (HAR)

Before triggering the creation of a new ‘Case’ entity, we first open the developer tools (<F12>) in Chrome.

jstack_007

That’s the ‘Network’ tab, showing ‘All’ types of requests, with ‘Preserve log’ enabled and a ‘Clear’ before we start. Do a first refresh of the screen, just to get a first result, which should be fast as nothing fancy happens…right?

Next is clicking the ‘Create’ icon to instantiate a new ‘Case’…

jstack_008

Your laptop fan will start to blow, telling you we’re in the endless loop!

Let’s now first right-click the network traffic and pick the option “Save all as HAR with content”; This is my copy. What can we find here? Well, HAR stands for “HTTP Archive”; the file is simply a JSON export of the requests (and their responses) that your actions as an end-user triggered. Why is this useful? Well, sending this file to your administrator gives that person the information needed to investigate a performance problem. Will we find something in this JSON file for our issue? Well, in our case NOT, as our last request is still “pending” (without a response to save!) and the rest is straightforward.

Still, it’s an interesting file; even more so when the total timings increase on the responses. That’s an indication something is wrong in the backend, or at least in the communication to the backend from the perspective of your end-user. Could be someone sitting/standing on the cat6-network cable? 🤣 (That’s an IT joke!)

What I also found is an interesting online tool to have this file analyzed:

jstack_009

Nicely ordered by ‘Total time’ and viewing the error responses. Pick the fruits as we’ll continue to the next step to find our performance hit!
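If you don’t want to upload a HAR file to an online tool, the same “order by total time and show the errors” view can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the standard HAR layout (log.entries with time, request, and response fields), and the function names are my own, not part of any AppWorks or browser tooling.

```python
import json
from pathlib import Path


def load_har(path: str) -> dict:
    """Load a HAR export; utf-8-sig also accepts files that start with a BOM."""
    return json.loads(Path(path).read_text(encoding="utf-8-sig"))


def slowest_requests(har: dict, top: int = 10) -> list:
    """Return (time_ms, status, method, url) tuples, slowest request first."""
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        rows.append((
            float(entry.get("time") or 0.0),   # total elapsed time in milliseconds
            int(response.get("status") or 0),  # falls back to 0 when no status is present
            request.get("method", "?"),
            request.get("url", "?"),
        ))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top]


def error_responses(har: dict) -> list:
    """Return (status, url) tuples for every response with status 400 or higher."""
    errors = []
    for entry in har.get("log", {}).get("entries", []):
        status = int(entry.get("response", {}).get("status") or 0)
        if status >= 400:
            errors.append((status, entry.get("request", {}).get("url", "?")))
    return errors
```

Calling slowest_requests(load_har("my_export.har"), top=5) on your own export (the file name is hypothetical) gives you the five slowest calls to start your investigation with.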


Java “jstack” and “jcmd” (in /bin)

Before we continue, let’s first agree on some ground rules when starting any tool:

  1. Run the same tool twice, once as a baseline measurement and once when the issue occurs. This way you can do a comparison between the two which makes things clearer.
  2. Make sure to restart your TomEE instance for a fresh empty-garbage measurement and fresh PID.
  3. Do this test outside office hours when the “regular” end-users have gone home already. That’s also the time when a “Cleaner” (Yeah, I read the book!) would strike; Just under the radar when the rest is sleeping!
  4. Limit access to the UI (with communication) for those smart people who sleep at the office!

Rules are in place, so let’s first restart TomEE (systemctl restart tomee) to get rid of our fan-blowing VM caused by the endless BPM loop! Interesting to know (and maybe you didn’t notice), but from a new browser tab you can still access the UI!?…Well, that’s the power of multi-threading! Is it endless? Nope…I tried six browser tabs, each starting an instance of the ‘Case’, and the seventh stayed blank with the circling icon of the platform! Eventually, I did a reboot with: sudo shutdown -r now. 🤠

Other great reads on this threading stuff:

Once the VM is available again, we start with the baseline measurement. As input, we need the Process IDentifier (PID) of our TomEE instance. From a Linux perspective, you have several options:

systemctl status tomee
ps -aux | grep tomee
pgrep -fa tomee
sudo cat /opt/tomee/latest/temp/tomee.pid
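When you script your measurements, the PID-file option is the easiest one to automate. Below is a minimal sketch that reads a PID file and checks whether that process is still alive. The helper names are hypothetical, the check is POSIX-only (signal 0 only tests for existence), and the path you pass in is whatever your own TomEE installation uses.

```python
import os
from pathlib import Path


def read_pid(pid_file: str) -> int:
    """Read a PID file (one number, optionally followed by a newline)."""
    pid = int(Path(pid_file).read_text(encoding="ascii").strip())
    if pid <= 0:
        raise ValueError(f"not a valid PID: {pid}")
    return pid


def is_running(pid: int) -> bool:
    """POSIX-only: signal 0 delivers nothing, it only checks that the PID exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # process exists, but belongs to another user
    return True
```

A stale PID file after a crash is a classic trap, so checking is_running(read_pid(...)) before you start a dump saves you from measuring the wrong (or no) process.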

With this PID ready, we start the first command for our baseline (ground-rules are still in place…correct?): sudo jstack {PID} > ~/threaddump_baseline.dump

Now open your runtime in (let’s say) 3 tabs and trigger the performance issue again. Quickly after triggering it (when your fan starts to blow again), take a second dump with: sudo jstack {PID} > ~/threaddump_issue.dump.
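One dump is a single snapshot; a thread that shows the same stack in several dumps taken a few seconds apart is a much stronger suspect. That is my own addition to the steps above, not something the platform requires. A minimal sketch, assuming jstack is on the PATH and that you run it as a user allowed to attach to the TomEE process; the function names and file naming are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def jstack_dump(pid: int) -> str:
    """Run `jstack <pid>` and return the thread dump as text."""
    result = subprocess.run(["jstack", str(pid)],
                            capture_output=True, text=True, check=True)
    return result.stdout


def take_dumps(pid, label, count=3, interval_s=5.0, out_dir=".",
               dump=jstack_dump, sleep=time.sleep):
    """Write `count` dumps to threaddump_<label>_<n>.dump and return the paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for index in range(1, count + 1):
        path = target / f"threaddump_{label}_{index}.dump"
        path.write_text(dump(pid), encoding="utf-8")
        paths.append(path)
        if index < count:
            sleep(interval_s)   # wait between snapshots, not after the last one
    return paths
```

The dump and sleep parameters are only there so you can try the function without a real JVM; in normal use you would call take_dumps(pid, "issue") and leave the defaults alone.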

Here are my dump files:

You can start reading them but be a wise man and use a tool like “jstack.review” for this AND do a comparison:

jstack_010

The image instantly tells me something gets a lock at HTTP level…duh! You can click on the IDs to investigate further; you get great insights with stacktrace information on what is going on during your actions.
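For a quick first impression without any tool, you can also count the thread states in both dumps and look at the difference. A minimal sketch, assuming the usual HotSpot dump layout where every Java thread has a “java.lang.Thread.State: …” line; the function names are my own.

```python
import re
from collections import Counter

# Matches lines such as "   java.lang.Thread.State: TIMED_WAITING (parking)"
STATE_LINE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def count_states(dump_text: str) -> Counter:
    """Count how many threads are in each state in one thread dump."""
    return Counter(STATE_LINE.findall(dump_text))


def state_delta(baseline: str, issue: str) -> dict:
    """Return {state: change} for every state whose count differs between dumps."""
    before = count_states(baseline)
    after = count_states(issue)
    return {
        state: after[state] - before[state]
        for state in sorted(set(before) | set(after))
        if after[state] != before[state]
    }
```

Feed it the text of your baseline and issue dump files; a jump in RUNNABLE or BLOCKED threads is a hint where to start clicking in the comparison view.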

Finally, I also found out that jstack is the older tool. Its successor is jcmd, which can output an equivalent dump with a slightly different command (the second line shows the help for it):

sudo jcmd {PID} Thread.print > ~/threaddump_baseline.dump
sudo jcmd {PID} help Thread.print

At the end of this post there is a list of the other commands that jcmd accepts besides Thread.print.
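Because every jcmd call has the same shape (jcmd, the PID, and a command name), collecting a few of them in one go is easy to script. A minimal sketch with hypothetical helper names and output file names; it assumes jcmd is on the PATH and that you only pass informational commands such as Thread.print, GC.heap_info, or VM.uptime.

```python
import subprocess
from pathlib import Path


def jcmd_plan(pid, commands, prefix="jcmd"):
    """Return (argv, output_filename) pairs, one per diagnostic command."""
    plan = []
    for command in commands:
        argv = ["jcmd", str(pid), command]
        filename = f"{prefix}_{command.replace('.', '_')}.txt"
        plan.append((argv, filename))
    return plan


def run_plan(plan, out_dir="."):
    """Execute every planned jcmd call and store its output in out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for argv, filename in plan:
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        (target / filename).write_text(result.stdout, encoding="utf-8")
```

Splitting the plan from the execution lets you print and review the exact commands before anything touches your TomEE process.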


A great “DONE” where we learned about some fascinating analytical tools for our AppWorks runtime UI. This of course works for any JVM and web UI type of interface and that is why you should be aware of the tooling. I was only aware of remote debugging on the JVM, but I’ll add these tools to my list as well. Have fun with it and I will see you in another great post on AppWorks Tips (don’t forget the list below to play with in your own spare time). 🍺


Compiler.CodeHeap_Analytics
Compiler.codecache
Compiler.codelist
Compiler.directives_add
Compiler.directives_clear
Compiler.directives_print
Compiler.directives_remove
Compiler.perfmap
Compiler.queue
GC.class_histogram
GC.finalizer_info
GC.heap_dump
GC.heap_info
GC.run
GC.run_finalization
JFR.check
JFR.configure
JFR.dump
JFR.start
JFR.stop
JVMTI.agent_load
JVMTI.data_dump
ManagementAgent.start
ManagementAgent.start_local
ManagementAgent.status
ManagementAgent.stop
System.trim_native_heap
Thread.print
VM.cds
VM.class_hierarchy
VM.classloader_stats
VM.classloaders
VM.command_line
VM.dynlibs
VM.events
VM.flags
VM.info
VM.log
VM.metaspace
VM.native_memory
VM.print_touched_methods
VM.set_flag
VM.stringtable
VM.symboltable
VM.system_properties
VM.systemdictionary
VM.uptime
VM.version