Tweaking task throughput
Ok, so during our most recent patch cycles we have been experiencing delays: patching with the K1000 is taking much longer than we expect, and we see a lot of jobs queuing up on the appliance.
We currently detect and deploy patches on our Windows servers in batches of 100. There is not much else running on the K1000 other than standard inventories.
Below is a snippet of the type of activity we see in our konductor log:
[2014-09-18 09:26:22 -0400] Konductor[29185] [main] stats [s:23038 t/s:0 t/tc:8 t:12660 tc:1417 c:11303 cc:2225 sl:15 sc:15512 tpl:40 apa:14 lt:5 lv:4.95]
[2014-09-18 09:26:47 -0400] Konductor[29185] [main] stats [s:23063 t/s:0 t/tc:8 t:12699 tc:1418 c:11341 cc:2226 sl:15 sc:15527 tpl:40 apa:16 lt:5 lv:4.68]
[2014-09-18 09:27:12 -0400] Konductor[29185] [main] stats [s:23088 t/s:0 t/tc:8 t:12737 tc:1419 c:11392 cc:2227 sl:15 sc:15542 tpl:40 apa:17 lt:5 lv:4.27]
[2014-09-18 09:27:29 -0400] Konductor[29185] [main] stats [s:23105 t/s:0 t/tc:8 t:12737 tc:1419 c:11432 cc:2228 sl:16 sc:15557 tpl:40 apa:34 lt:5 lv:6.38]
[2014-09-18 09:27:48 -0400] Konductor[29185] [main] stats [s:23124 t/s:0 t/tc:8 t:12737 tc:1419 c:11452 cc:2229 sl:17 sc:15573 tpl:40 apa:42 lt:5 lv:7.91]
[2014-09-18 09:28:09 -0400] Konductor[29185] [main] stats [s:23145 t/s:0 t/tc:8 t:12737 tc:1419 c:11477 cc:2230 sl:18 sc:15590 tpl:40 apa:32 lt:5 lv:8.16]
[2014-09-18 09:28:28 -0400] Konductor[29185] [main] stats [s:23164 t/s:0 t/tc:8 t:12737 tc:1419 c:11493 cc:2231 sl:19 sc:15608 tpl:40 apa:33 lt:5 lv:9.83]
[2014-09-18 09:28:48 -0400] Konductor[29185] [main] stats [s:23184 t/s:0 t/tc:8 t:12737 tc:1419 c:11511 cc:2232 sl:21 sc:15627 tpl:40 apa:29 lt:5 lv:11.34]
[2014-09-18 09:29:11 -0400] Konductor[29185] [main] stats [s:23207 t/s:0 t/tc:8 t:12737 tc:1419 c:11530 cc:2233 sl:22 sc:15648 tpl:40 apa:27 lt:5 lv:9.58]
[2014-09-18 09:29:34 -0400] Konductor[29185] [main] stats [s:23230 t/s:0 t/tc:8 t:12737 tc:1419 c:11550 cc:2234 sl:23 sc:15670 tpl:40 apa:24 lt:5 lv:9.46]
[2014-09-18 09:29:58 -0400] Konductor[29185] [main] stats [s:23254 t/s:0 t/tc:8 t:12737 tc:1419 c:11577 cc:2235 sl:24 sc:15693 tpl:40 apa:21 lt:5 lv:8.72]
[2014-09-18 09:30:22 -0400] Konductor[29185] [main] stats [s:23278 t/s:0 t/tc:8 t:12737 tc:1419 c:11604 cc:2236 sl:25 sc:15717 tpl:40 apa:20 lt:5 lv:7.61]
[2014-09-18 09:31:08 -0400] Konductor[29185] [main] stats [s:23324 t/s:0 t/tc:8 t:12759 tc:1420 c:11616 cc:2237 sl:25 sc:15742 tpl:40 apa:33 lt:5 lv:4.74]
[2014-09-18 09:31:45 -0400] Konductor[29185] [main] stats [s:23361 t/s:0 t/tc:8 t:12781 tc:1421 c:11649 cc:2238 sl:25 sc:15767 tpl:40 apa:22 lt:5 lv:5.64]
[2014-09-18 09:32:26 -0400] Konductor[29185] [main] stats [s:23402 t/s:0 t/tc:8 t:12781 tc:1421 c:11684 cc:2239 sl:26 sc:15792 tpl:40 apa:33 lt:5 lv:6.93]
[2014-09-18 09:33:17 -0400] Konductor[29185] [main] stats [s:23453 t/s:0 t/tc:9 t:12803 tc:1422 c:11709 cc:2240 sl:26 sc:15818 tpl:40 apa:23 lt:5 lv:5.92]
[2014-09-18 09:34:02 -0400] Konductor[29185] [main] stats [s:23498 t/s:0 t/tc:9 t:12842 tc:1423 c:11769 cc:2241 sl:26 sc:15844 tpl:40 apa:23 lt:5 lv:4.52]
[2014-09-18 09:36:01 -0400] Konductor[29185] [main] stats [s:23617 t/s:0 t/tc:9 t:12879 tc:1424 c:11830 cc:2242 sl:25 sc:15870 tpl:40 apa:41 lt:5 lv:3.43]
[2014-09-18 09:36:30 -0400] Konductor[29185] [main] stats [s:23646 t/s:0 t/tc:9 t:12912 tc:1425 c:11900 cc:2243 sl:24 sc:15895 tpl:40 apa:21 lt:5 lv:3.99]
[2014-09-18 09:36:56 -0400] Konductor[29185] [main] stats [s:23672 t/s:0 t/tc:9 t:12912 tc:1425 c:11969 cc:2244 sl:25 sc:15919 tpl:40 apa:27 lt:5 lv:6.76]
[2014-09-18 09:37:21 -0400] Konductor[29185] [main] stats [s:23697 t/s:0 t/tc:9 t:12912 tc:1425 c:11985 cc:2245 sl:26 sc:15944 tpl:40 apa:29 lt:5 lv:8.57]
[2014-09-18 09:37:47 -0400] Konductor[29185] [main] stats [s:23723 t/s:0 t/tc:9 t:12912 tc:1425 c:11996 cc:2246 sl:27 sc:15970 tpl:40 apa:28 lt:5 lv:9.87]
[2014-09-18 09:38:15 -0400] Konductor[29185] [main] stats [s:23751 t/s:0 t/tc:9 t:12912 tc:1425 c:12009 cc:2247 sl:29 sc:15997 tpl:40 apa:27 lt:5 lv:11.52]
[2014-09-18 09:38:44 -0400] Konductor[29185] [main] stats [s:23780 t/s:0 t/tc:9 t:12912 tc:1425 c:12016 cc:2248 sl:30 sc:16026 tpl:40 apa:25 lt:5 lv:11.36]
[2014-09-18 09:39:14 -0400] Konductor[29185] [main] stats [s:23810 t/s:0 t/tc:9 t:12912 tc:1425 c:12020 cc:2249 sl:30 sc:16056 tpl:40 apa:23 lt:5 lv:10.56]
As you can see, during patching we see load values (lv) above our load threshold (lt) for extended periods, usually 10-20 minutes at a time.
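In case it helps, this is the rough script I use to pull the lt/lv pair out of each stats line and measure how long the load stays above the threshold. It assumes lt is the load threshold and lv is the current load value (which is how I read the log), and "konductor.log" is just a placeholder for wherever the log lives:

#!/usr/bin/env python3
"""Sketch: scan a konductor log and report how long the reported load
value (lv) stays above the load threshold (lt). Field meanings are
assumed, not documented; the log path is a placeholder."""

import re
from datetime import datetime

# Matches the leading timestamp plus the lt/lv pair in each stats line.
LINE_RE = re.compile(
    r"^\[(?P<ts>[\d-]+ [\d:]+) [+-]\d{4}\] Konductor.*"
    r"\blt:(?P<lt>[\d.]+) lv:(?P<lv>[\d.]+)\]"
)

def over_threshold_spans(path):
    """Yield (start, end, peak_lv) for each stretch where lv > lt."""
    start = last = peak = None
    with open(path) as fh:
        for line in fh:
            m = LINE_RE.match(line)
            if not m:
                continue
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            lt, lv = float(m.group("lt")), float(m.group("lv"))
            if lv > lt:
                if start is None:
                    start, peak = ts, lv
                last, peak = ts, max(peak, lv)
            elif start is not None:
                yield start, last, peak
                start = last = peak = None
    if start is not None:
        yield start, last, peak

if __name__ == "__main__":
    # Point this at your actual konductor log.
    for begin, end, peak in over_threshold_spans("konductor.log"):
        minutes = (end - begin).total_seconds() / 60
        print(f"{begin} -> {end}  ({minutes:.1f} min over threshold, peak lv {peak})")

Running that over the snippet above is how I arrived at the 10-20 minute figure.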
So my question is this: what should I do? Increase the load threshold? Decrease it? Add more CPU (this is a virtual appliance)?
My Munin graphs do not suggest that I am CPU-bound or short on memory, so I am reluctant to throw more resources at this server, though I'm not opposed to it.
Any advice?