Ep. 49: Monitoring an Oracle database with Checkmk

[0:00:00]	Welcome to the Checkmk Channel. Today I'll be talking about monitoring an Oracle database with Checkmk.
[0:00:15]	In this tutorial I will show you how to set up the Oracle monitoring with the Agent Bakery and we will also have a look at the command line, wherein I will show you how to run the agent plugin manually, look at the different command line options that it offers and also look at the configuration file. So without further Ado let's get started.
[0:00:32]	As a first step we will create a user in the Oracle database which will be used for monitoring to create that user we can directly go to our Oracle documentation where we can find this information in the chapter number two and here we have "Create a user".
[0:00:57]	For this demo I'm using a multi-tenant database and in order to create a user for that I will have a look here under multi-tenant databases.
[0:01:07]	Here there are some commands that I need to execute against my Oracle database so let's copy these commands. And now go to the Oracle database let's copy this replace it with a password where I'm keeping the password as Secret.
[0:01:33]	And then we can copy the other commands which says "alter user c##checkmk set container_data=all". And then do some execute some Grant statements which is going to assign a select catalog role to this user and also a "select create session privilege" to this user. And after that we can just do a simple exit.
[0:02:13]	So we have created the user successfully, now we can go to the monitoring web UI in order to do the monitoring configuration for the Oracle database.
[0:02:24]	As a first step we will be creating a host which we will be using for monitoring. So let's click on: Add host to monitoring. Define the hostname, Define the IP address and rest the data source which is selected as API integration if configured else Checkmk agent no SNMP can remain as it is. We can just simply save and go to folder, after that we can click on setup search for Oracle. Under agent rules you can select Oracle databases.
[0:02:58]	Click here click on "Add rule". Click on "Activation" because we want to deploy the Oracle database agent plugin click on host if you are using xinetd then select xinetd in my case I'm using systemd. So I will select this from the drop-down my interval will remain as one minute. This is primarily required for running asynchronous sections which we will also have a look later when I show you the configuration files.
[0:03:40]	Then I will choose my login credentials, choose the authentication method. Define my user that I created my user was c##checkmk and the password was Secret.
[0:03:53]	The role was a normal connection but you could also select these different roles from the drop down, click on host name since I'm deploying this agent plugin and configuration on my Oracle database so it can stay as localhost but if it's a different name hostname or IP then you need to Define it here. Click on the TCP port for listener my default Port is 1521 so I will keep it as it is.
[0:04:24]	If you want to define a TNS alias you could also define it here I will leave this field as blank. If you have multiple databases and you want to define the authentication credentials for those SIDs that you could also define it here in my case I just have one single database so I will leave it as it is.
[0:04:51]	But otherwise you could keep on adding new elements and define the different authentication credentials for your different SIDs. Next option is login for ASM if you have ASM enabled on your database then you could also define the credentials here it's the same settings where you define an authentication method, you define a role hostname TCP port and TNS alias.
[0:05:22]	Next setting is instances to monitor where you can select if you would like to monitor all your interfaces only query the following SIDs or skip some of the SIDs that are found. In my case I'll just select monitor all instances found. You could also add a pre or post fix to your TNS aliases here. since I don't need the setting I will just skip it and then the interesting setting is "Sections", what are the sections that you would like to fetch or from your agent plugin.
[0:06:02]	You can see that some sections are configured to run synchronously and some are set to be "Run asynchronously and cached", this setting can also be altered by selecting the right option some sections can be disabled.
[0:06:17]	If you would like to disable it for a particular SID then you could also exclude some sections on certain instances like, define the name of the SID and then choose which sections should be excluded for that particular SID. I will go with the default settings so I won't select any sections if you don't select any options then the sections which are supposed to be run synchronously and asynchronously are configured by default which we will also look in the CFG file which gets generated from this rule. Next option is the cache age for background checks the mk_oracle plugin or the agent plugin, it has its own caching mechanism the default is 600 seconds that's 10 minutes.
[0:07:13]	And this cache age can also be changed but if you provide a different number here. I will keep the default values so we'll not select anything. Sqlnet send timeout is another setting with which you can select a timeout, it's selected as 30 seconds since I don't need it I will also skip it. Okay the next two settings that is Remote instances and ORACLE_HOME to use for remote access, this is primarily required when you want to monitor Oracle database without installing the agent plugin on it.
[0:07:54]	And this is something that we will cover in the next tutorial. And last but not the least is the TNS_ADMIN, this is basically the path that you will define where you place your sqlnet.ora and tnsnames.ora. If you leave this field blank then it will by default /etc/check_mk. So we are done with the configuration and now I'm going to assign this to the host that I created and I will click on Save.
[0:08:32]	After that, before we activate the change we could also go to setup press Windows, Linux click on bake agents. And once our agent is baked we can then activate the change. Go to setup again Windows Linux and then download the package that we installed. In my case my Oracle database server is running on a red hat server so I'll download an RPM package.
[0:09:12]	And we will now copy this package to our Oracle server. So let's quickly go to the command line. And look for the file.
[0:09:30]	That's the file we will now copy this over to our Oracle server. All right we can now have a look at our SQL server. Switch here. So this is the file that we just now copied over, let's copy this name and install the package.
[0:10:05]	As you can see the package has been installed, let's have a look into the /etc/check_mk folder where our configuration files are placed. As you can see we now have an mk_oracle.cfg file. If we look at the folder /uer/lib/check_mk_agent/plugins/.
[0:10:34]	We now have the folder 60 which you might have noticed when we selected the systemd. The interval was set to 60 seconds so this is gonna help the long running queries and the asynchronous checks to be executed successfully. If we look under the folder 60 this is where our plugin is available. Now to run the plugin we can just copy this path go to this and after that we can just simply export some environment variables which will be required by the plugin.
[0:11:13]	This is a MK_CONFDIR = /etc/check_mk/. The other one is, MK_VARDIR, which basically contains the cache information from this uh from this plugin and that's it. And now we can run the plugin.
[0:11:49]	First of all we can run the plugin with the '-t' option this will be like a connection test if we run it we will see the information directly displayed on the console. As an initial output there will be empty sections and after that it will show you depending on how many instances you have configured it will show you the login information. Here it shows that the operating system is Linux 'ORACLE_HOME' that was detected from the ORA tab.
[0:12:24]	What is the login check to this instance, the version, the user information, sync sections and the async sections. So when I talked about those sections that we can select synchronously and asynchronously I was talking about this configuration and you can also have a look at the /etc/check_mk/mk_oracle.cfg or we can also open the file.
[0:12:54]	Here as you can see that this file was created by the checkmk agent bakery, under connection and authentication we see the DBUSER wherein we have our user the password ,the role is blank, it's a normal connection, this is the host name, this is the port, and last is the value for the version.
[0:13:16]	Then you have a default setting for ASMUSER. Since we are not using it so this can stay as it is. Then comes the sync sections and the async sections. So, sync sections are the ones which run in foreground and wait for the result. These are the sections which you can see are run synchronously.
[0:13:39]	And then there are async sections which where basically the queries can take a longer time so these sections are run in the background and they run at a slower interval and then at the bottom we have something about ASM and async. Since we're not using it we can ignore it and then we have something about the background the cache age which we set up as 600 seconds for the async sections.
[0:14:07]	All right let's go out of this configuration file. We already performed the connection test and it was successful. Now we can straight away remove this and we can also have a look at what are the other different command line options available with the mk_oracle plugin. If we just type 'mk_oracle --help' we can have a look at all the options.
[0:14:34]	Here you have a '-t' with which you can enable the full bash debugging, you can check the connection like I did the connection test, you can also call this plugin with the '-l' option which basically stores the results in a log file. You can also call a corresponding section or you could also perform an 'oratestversion'.
[0:14:59]	And here it also talks about different sections that are available under the configuration, sync sections async sections, information about the ASM, the cache age, if you are only interested in specific SIDs that you could also directly specify inside this file, but please make sure that you have set it up by the bakery, so if automatic agent update is configured then these changes will be overwritten.
[0:15:30]	Like this there are settings for the remote instances which will be configuring in the next tutorials and then there is information about how you could call Custom SQL queries this will also be covered in the next tutorials. And all the other settings are related to the custom SQLs. So now we can just execute the plugin without any special option. If we run this plugin we will see first of all it displays some empty sections, if you want to look at the output and let's see what information we get.
[0:16:14]	When you run the agent plugin for the first time you will see some empty sections in the beginning which is perfectly fine and after that you will see some information being populated against different sections. As you can see we get information about Oracle instances, the Oracle performance metrics and we can also just look for this string Oracle system parameters Oracle processes, what are the sessions, long active sessions, logswitches, undostat, what is the recovery status, information about the Oracle logs tablespaces, in some of the sections you also see the word "cached" this is basically your asynchronous checks.
[0:16:55]	If you remember we we had two sections "sync_sections ", "async_sections". And the Oracle table spaces is part of the async sections and this is why these results are cached.
[0:17:13]	Then you have the Oracle jobs and then it's finished after Oracle resumes. We can also have a look at the cache file, it will be generated under '/var/check_mk_agent/cache/' folder. There will be two files that will be generated when you use systemd because you run the plugin with the 60 Minutes of interval and on top of that you also run some async sections. So if we look at this file "plugins_mk_oracle.cache" you will see the complete information about all the sections, but if you look at this particular file which is "oracle_proddb.cache".
[0:18:08]	You will see only information about the async sections. Alright so now we can go back to the monitoring UI and perform the service discovery. All right let's click on the Oracle host. All right here it shows that communication has failed this is because I have not registered the agent against this monitoring instance. So let's go back to the command line and do the registration.
[0:18:47]	It should be done by the 'cmk-agent-ctl' command, let's register the host it will be the oracle-host with the with the server that is going to be my 10.203.8 which will be my demo instance, this will be the mysite the user and we will fill out the password when it asks for it. So if you just press enter you have to establish the connection, you simply press yes and then type the password.
[0:19:28]	And after that you will see the screen, if you just now type cmk-agent-ctl status, you will see now the agent has been registered against this monitoring server on Port 8002 and the site is demo. So now let's go back to the monitoring web UI and perform the discovery.
[0:19:53]	So if we now reschedule this check you will see it will now try to get some information about the services that it found. It already found 68 unmonitored Services which is a mix of your basic Linux metrics and also the Oracle checks that are provided by the mk_oracle agent plugin.
[0:20:18]	Click on host, click on service configuration, we can already see this information here under the undecided services. In the beginning you will definitely see your basic Linux metrics like memory CPU, Filesystem, Mount points and then you will see information about your Oracle instances.
[0:20:45]	This is my CDB name and I get some parameters and on the right-hand side I also see which check plugin is responsible for creating this service check. This can also be activated by using show plugin names, you can also hide it if you don't want to see these different columns. And then we get all the other information about my Oracle database like jobs, get information about the sessions, the different performance metrics, information about my table space, and now we can accept all the changes. And last but not the least let's activate the change.
[0:21:33]	Let's give it some time or you can also just simply reschedule the check. So now you will see all your Oracle service checks available for your Oracle database which were created by the Oracle agent plugin.
[0:21:53]	So that was our tutorial about monitoring an Oracle database. Please like and subscribe and see you next time.