{"id":3802,"date":"2021-02-05T18:47:23","date_gmt":"2021-02-06T00:47:23","guid":{"rendered":"http:\/\/benincosa.com\/?p=3802"},"modified":"2021-02-05T18:47:23","modified_gmt":"2021-02-06T00:47:23","slug":"prometheus-service-monitors","status":"publish","type":"post","link":"https:\/\/benincosa.com\/?p=3802","title":{"rendered":"Prometheus Service Monitors"},"content":{"rendered":"\n<p>Prometheus is confusing.  It&#8217;s such a great project and there is all kinds of information out there, but it&#8217;s taken me a bit of legwork to understand it.  <\/p>\n\n\n\n<p>The first issue is: How are you going to install it?  Since I&#8217;m running this on Kubernetes it makes sense to use whatever most people are doing.  And commonly, the answer is Helm.  I&#8217;m not a big fan of Helm because of all the added features like releases and heaviness in setting it up.  I would instead prefer a lightweight variable substitution system with Jinja2 and a bunch of manifest files, but fine, Helm is what we&#8217;ll use.  <\/p>\n\n\n\n<p>Now you have to find the right repo.  So we do the following:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>helm repo add prometheus-community https:\/\/prometheus-community.github.io\/helm-charts<\/code><\/pre>\n\n\n\n<p>  You wanted Prometheus, so it&#8217;s in there, right? 
<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>helm search repo prometheus-community\nNAME                                              \tCHART VERSION\tAPP VERSION\tDESCRIPTION\nprometheus-community\/alertmanager                 \t0.5.0        \tv0.21.0    \tThe Alertmanager handles alerts sent by client ...\nprometheus-community\/kube-prometheus-stack        \t13.5.0       \t0.45.0     \tkube-prometheus-stack collects Kubernetes manif...\nprometheus-community\/prometheus                   \t13.2.1       \t2.24.0     \tPrometheus is a monitoring system and time seri...\nprometheus-community\/prometheus-adapter           \t2.11.1       \tv0.8.3     \tA Helm chart for k8s prometheus adapter\nprometheus-community\/prometheus-blackbox-exporter \t4.10.2       \t0.18.0     \tPrometheus Blackbox Exporter\nprometheus-community\/prometheus-cloudwatch-expo...\t0.13.0       \t0.8.0      \tA Helm chart for prometheus cloudwatch-exporter\nprometheus-community\/prometheus-consul-exporter   \t0.4.0        \t0.4.0      \tA Helm chart for the Prometheus Consul Exporter\nprometheus-community\/prometheus-couchdb-exporter  \t0.2.0        \t1.0        \tA Helm chart to export the metrics from couchdb...\nprometheus-community\/prometheus-druid-exporter    \t0.9.0        \tv0.8.0     \tDruid exporter to monitor druid metrics with Pr...\nprometheus-community\/prometheus-elasticsearch-e...\t4.1.0        \t1.1.0      \tElasticsearch stats exporter for Prometheus\nprometheus-community\/prometheus-kafka-exporter    \t1.0.0        \tv1.2.0     \tA Helm chart to export the metrics from Kafka i...\nprometheus-community\/prometheus-mongodb-exporter  \t2.8.1        \tv0.10.0    \tA Prometheus exporter for MongoDB metrics\nprometheus-community\/prometheus-mysql-exporter    \t1.0.1        \tv0.12.1    \tA Helm chart for prometheus mysql exporter with...\nprometheus-community\/prometheus-nats-exporter     \t2.5.1        \t0.6.2      \tA Helm chart for 
prometheus-nats-exporter\nprometheus-community\/prometheus-node-exporter     \t1.14.2       \t1.0.1      \tA Helm chart for prometheus node-exporter\nprometheus-community\/prometheus-operator          \t9.3.2        \t0.38.1     \tDEPRECATED - This chart will be renamed. See ht...\nprometheus-community\/prometheus-pingdom-exporter  \t2.3.2        \t20190610-1 \tA Helm chart for Prometheus Pingdom Exporter\nprometheus-community\/prometheus-postgres-exporter \t1.9.0        \t0.8.0      \tA Helm chart for prometheus postgres-exporter\nprometheus-community\/prometheus-pushgateway       \t1.7.0        \t1.3.0      \tA Helm chart for prometheus pushgateway\nprometheus-community\/prometheus-rabbitmq-exporter \t0.6.0        \tv0.29.0    \tRabbitmq metrics exporter for prometheus\nprometheus-community\/prometheus-redis-exporter    \t4.0.0        \t1.11.1     \tPrometheus exporter for Redis metrics\nprometheus-community\/prometheus-snmp-exporter     \t0.1.1        \t0.19.0     \tPrometheus SNMP Exporter\nprometheus-community\/prometheus-stackdriver-exp...\t1.8.0        \t0.11.0     \tStackdriver exporter for Prometheus\nprometheus-community\/prometheus-statsd-exporter   \t0.2.0        \t0.18.0     \tA Helm chart for prometheus stats-exporter\nprometheus-community\/prometheus-to-sd             \t0.4.0        \t0.5.2      \tScrape metrics stored in prometheus format and ...<\/code><\/pre>\n\n\n\n<p>So yeah, there&#8217;s a lot to choose from here.  I decided to just go with the simple all-in-one Helm chart because it had Grafana, Alertmanager, and all the things I think I&#8217;ll need at some point.  Maybe it&#8217;s a lot of bloat, but we had to start somewhere. <\/p>\n\n\n\n<p>The next thing you&#8217;ll start thinking is: Hey, I need to customize this.  So I started with Grafana.  <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Grafana Config<\/h2>\n\n\n\n<p>There are so many things to customize that I didn&#8217;t even scratch the surface. 
My biggest change here was enabling the sidecars so I could add my own custom notifiers (Slack), dashboards (ones we created), and data sources (Elasticsearch).  (It picks up Prometheus automatically.)  <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>...\ngrafana:\n  sidecar:\n    enabled: true\n    dashboards:\n      label: grafana_dashboard\n      labelValue: 1\n      searchNamespace: monitoring\n    datasources:\n      label: grafana_datasource\n      labelValue: 1\n      searchNamespace: monitoring\n    notifiers:\n      enabled: true\n      label: grafana_notifier\n      labelValue: 1\n      searchNamespace: monitoring<\/code><\/pre>\n\n\n\n<p>My dashboards were then just ConfigMaps.  You can use Secrets instead if a dashboard contains anything sensitive, but mine had no real secrets in them, so one looked something like this: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: super-dash\n  namespace: monitoring\n  labels:\n    grafana_dashboard: \"1\"\ndata:\n  super-dash.json: |-\n        { &lt;some json> }<\/code><\/pre>\n\n\n\n<p>Notice that the <code>label<\/code> name and value have to match the respective <code>label<\/code> and <code>labelValue<\/code> in the config.<\/p>\n\n\n\n<p>The other tricky thing is that you need to deploy these dashboards, data sources, and notifiers before Grafana loads so it can make use of them.  In my CircleCI Makefile I added the command: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>kubectl rollout restart -n monitoring deployment kube-prom-grafana<\/code><\/pre>\n\n\n\n<p> That way, if the dashboards are updated, Grafana restarts and picks up the change automatically.  <\/p>\n\n\n\n<p>After adding all the other configurations I set my eyes on Prometheus. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Prometheus Config<\/h2>\n\n\n\n<p>  In the same <code>values.yaml<\/code> file I used with Helm, I didn&#8217;t really change much.  
I found that I could create <a href=\"https:\/\/github.com\/prometheus-operator\/prometheus-operator\/blob\/master\/Documentation\/user-guides\/getting-started.md#include-servicemonitors\">Service Monitors<\/a>. The idea of service monitors is that the operator watches the Kubernetes cluster for changes, and if a new service matching your selector comes along, Prometheus starts monitoring it.  For example, I created a service monitor that looked like this: <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Monitor every service labeled name=myapp that exposes a port named metrics in the app namespace.\napiVersion: monitoring.coreos.com\/v1\nkind: ServiceMonitor\nmetadata:\n  name: app\n  labels:\n    # Make sure this matches your helm release\n    release: kube-prom\n  namespace: app\nspec:\n  selector:\n    matchLabels:\n      name: myapp\n  endpoints:\n  - port: metrics<\/code><\/pre>\n\n\n\n<p>This was pretty frustrating to get working.  So let me walk through the issues you&#8217;ll likely hit doing this and how to fix them. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Getting the Service Monitor Picked up<\/h3>\n\n\n\n<p>At first I tried this a dozen times and never saw anything in the Prometheus UI showing that it was scraping anything.  It didn&#8217;t even update my Prometheus config.  Then I stumbled across this <a href=\"https:\/\/stackoverflow.com\/questions\/60706343\/prometheus-operator-enable-monitoring-for-everything-in-all-namespaces\">post<\/a> and realized that the best way (in my case) to have the operator pick up my service monitor was to include the label <code>release: kube-prom<\/code>.  The reason is that I installed my Helm chart with the release name <code>kube-prom<\/code>, and by default the chart only selects service monitors that carry the matching <code>release<\/code> label.  So that was the first &#8216;ah hah!&#8217; moment.  Then I could finally see it in the Prometheus UI. <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Namespace<\/h3>\n\n\n\n<p>My apps all run in the <code>app<\/code> Kubernetes namespace, so I had to deploy the service monitor to the <code>app<\/code> namespace for it to find anything.  
Before that I had been deploying it to the <code>monitoring<\/code> namespace with the rest of my Helm chart.  Once I put it in the namespace where the service was running, it could find the service.  Now this may sound really obvious to you, but I thought for some reason it would just look in all namespaces.  Guess not: by default a service monitor only matches services in its own namespace, unless you set <code>namespaceSelector<\/code> in its spec.  <\/p>\n\n\n\n<h3 class=\"wp-block-heading\">On to Victory<\/h3>\n\n\n\n<p>Once it finally came up, I saw it automatically discover any app that had the metrics endpoint.  Super rad. Prior to this we had been hard-coding all the values in the config.  Now we pick up new services automatically and get everything for free.  <\/p>\n\n\n\n<p>I&#8217;m really impressed with all the capabilities of the Prometheus operator and glad I finally got to kick the tires on it this week. Now, with a better understanding, I&#8217;m hoping I can help our application team get a lot more insight into the system. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Prometheus is confusing. It&#8217;s such a great project and there is all kinds of information out there, but it&#8217;s taken me a bit of legwork to understand it. The first issue is: How are you going to install it? 
Since I&#8217;m running this on Kubernetes it makes sense to use whatever most people are doing&#8230;.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[797],"tags":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/benincosa.com\/index.php?rest_route=\/wp\/v2\/posts\/3802"}],"collection":[{"href":"https:\/\/benincosa.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/benincosa.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/benincosa.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/benincosa.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3802"}],"version-history":[{"count":1,"href":"https:\/\/benincosa.com\/index.php?rest_route=\/wp\/v2\/posts\/3802\/revisions"}],"predecessor-version":[{"id":3803,"href":"https:\/\/benincosa.com\/index.php?rest_route=\/wp\/v2\/posts\/3802\/revisions\/3803"}],"wp:attachment":[{"href":"https:\/\/benincosa.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3802"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/benincosa.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3802"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/benincosa.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3802"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}