A J Thompson | Andrew Thompson, Rockhampton Australia

Identifying the source of lag on an OpenSimulator varregion

First published 25 May 2015 ~ Page moved here 13 July 2016

I'm writing this piece on finding the cause of lag in OpenSim for the benefit of other grid or region owners.

Xay Tomsen at Aratura City 2016

When I Googled for tips myself, I found very little information to help me solve my problem, so hopefully others will stumble upon this article in the future and find it beneficial.

Whether you're a grid owner like me, or an estate owner like Leah, who runs the Kiwi Island estate on one of my worlds, the last thing you want to hear is someone say 'Damn this sim is laggy'.

For the past few days, poor old Kiwi Island has experienced lag in spades.

The region reboots, lags up straight away, then within five minutes, the restart warning flashes on the screen.

If you're very lucky, you might get ten minutes, but don't try terraforming or you can expect an immediate two-minute countdown to evacuate.

The latter pointed towards a memory leak: the region wasn't coping with changes to its terrain. But again, what does that really tell you? The backend systems say very little except that there's an error causing a memory leak. Identifying the cause of that 'error' is basically the same as solving the cause of your 'lag' problem. It's all different ways of saying the same thing.

Identifying the cause of lag on any grid is a pain. In this case it was an even greater pain, in that:

I went through the list, starting with the more obvious potential culprits. Eventually, two days later, I solved it. But instead of sitting here now feeling very smug and self-satisfied, I feel like a total idiot for not seeing the problem sooner.

Chances are though, that if you're reading this page, you're having serious lag problems as well and can't find the cause, so let's go through the elimination process I went through, and that you should go through too.

Frames per second & Packet Loss .:. lag in OpenSim

Assuming you're using the Firestorm viewer, go up to the little flashy bar in the top right corner of your screen and click on it.

The Statistics Window opens. Packet loss should be 0%. If it isn't, you have a network congestion problem. Maybe it's a temporary thing due to high demand on the internet at this particular time, or it could be a problem in your own network.

Next, scroll down to 'Simulator' and look at Sim FPS. Over 40 is OK, over 50 is great. If it is below 30, it's worth taking another look at your operating system, RAM, and configuration.
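As a rough illustration, the two readings above can be turned into a small triage function. This is only a sketch in Python: the function name is made up for this example, and the 'borderline' label for the 30 to 40 range is an assumption, since the thresholds above don't classify it.

```python
def triage(packet_loss_pct: float, sim_fps: float) -> list[str]:
    """Return plain-language notes for two Statistics Window readings."""
    notes = []
    # Packet loss should be 0%; anything above that suggests a network problem.
    if packet_loss_pct > 0:
        notes.append("packet loss above 0%: check for network congestion")
    if sim_fps > 50:
        notes.append("sim FPS is great")
    elif sim_fps > 40:
        notes.append("sim FPS is OK")
    elif sim_fps < 30:
        notes.append("sim FPS is low: review operating system, RAM and configuration")
    else:
        # ASSUMPTION: the article does not classify 30 to 40, so it is
        # labelled 'borderline' here.
        notes.append("sim FPS is borderline")
    return notes
```

For example, `triage(0, 55)` returns `["sim FPS is great"]`, while a reading of 2.5% packet loss and 20 FPS returns both the network note and the low FPS note.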

Laggy scripts .:. lag in OpenSim

Scripts make things happen in virtual worlds. They bring the world alive and are necessary, but sometimes they aren't written particularly well.

To locate a problematic script, go into your viewer Estate Debug Tools. Click the Get Top Scripts button. Scroll through the list. Does anything stand out as excessively time intensive? If so, disable that script and see if it makes a difference to sim performance.

It is worth noting that the debug tools won't pick up every script, so this isn't a magic bullet to find all your lag-inducing scripts. Personally, I view the debug tools as a guide, nothing more.
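If the Top Scripts list is long, one way to decide what 'stands out' is to compare each script's time against the typical value. The sketch below assumes you have copied the names and times out of the list by hand. The function name, the sample entries and the factor of 5 are all hypothetical choices for illustration, not anything the viewer or OpenSim provides.

```python
from statistics import median

def flag_heavy_scripts(entries, factor=5.0):
    """Return the names of scripts whose time is far above the median.

    entries: list of (script_name, time_ms) pairs copied by hand
             from the Top Scripts list (hypothetical input format).
    factor:  how many times the median counts as 'standing out'
             (an arbitrary assumption; tune it to taste).
    """
    if not entries:
        return []
    typical = median(t for _, t in entries)
    return [name for name, t in entries if t > factor * typical]

# Made-up sample data, for illustration only.
sample = [("door", 0.02), ("sign", 0.03), ("fan", 0.90), ("radio", 0.04)]
print(flag_heavy_scripts(sample))  # prints ['fan']
```

Treat the result the same way as the debug tools themselves: as a guide to which scripts to disable first, nothing more.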

Physical Objects & Collisions .:. lag in OpenSim

Physical objects are any that collide with the region's terrain or other non-phantom prims. Physical objects put load on the region's performance.

Vehicles and bots (NPCs or non-player characters) are often the main contenders for causing lag that is induced by physical objects colliding. Another is scripted objects that rotate or move, which the creator could have easily made phantom, but did not.

That said, don't bother using the 'Get Top Colliders' button in your debug tools. It isn't supported for OpenSim.

To give you an idea how physical objects affect performance, here's a case I saw years ago in Second Life, where a creator had made coconut trees that dropped a coconut every 30 seconds which would roll gracefully down into the sea.

Very pretty, however the coconuts weren't scripted to self-delete. They just built up and built up, creating hundreds of physical objects adding to the burden of the region. Multiply that by 50 or so coconut trees on the island. Eventually the region would crash, restart, then instantly crash again.

If you use a script that auto-generates physical objects, make sure that those objects have a self-delete timer built in.
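To see why the self-delete timer matters, here is a back-of-the-envelope model of the coconut case in Python. It is a sketch under simple assumptions: every tree drops on the same fixed schedule, nothing else removes the coconuts, and the 60 second lifetime is a made-up value for illustration. The function and parameter names are hypothetical.

```python
def live_objects(trees, drop_interval_s, elapsed_s, lifetime_s=None):
    """Count the dropped objects still in the region after elapsed_s seconds.

    Each tree drops one object at drop_interval_s, 2 * drop_interval_s, ...
    With lifetime_s=None the objects never delete themselves.
    """
    drops_per_tree = elapsed_s // drop_interval_s
    if lifetime_s is None:
        return trees * drops_per_tree
    # An object dropped at time d still exists while elapsed_s < d + lifetime_s.
    alive = sum(
        1
        for k in range(1, drops_per_tree + 1)
        if k * drop_interval_s + lifetime_s > elapsed_s
    )
    return trees * alive

# 50 trees, one coconut every 30 seconds, one hour after a restart.
print(live_objects(50, 30, 3600))                 # prints 6000
print(live_objects(50, 30, 3600, lifetime_s=60))  # prints 100
```

Without a timer the count grows for as long as the region stays up. With one, it levels off at a small fixed number no matter how long the region runs.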

Vehicles .:. lag in OpenSim

Boats and cars in OpenSim

In the case of vehicles, they are designed to be used temporarily. The action of using a vehicle has a very small impact on sim performance, which is fine, again because it's temporary.

Once you and your passengers get out of the vehicle, the impact on sim performance returns to nil.

Vehicles generally don't cause region performance issues unless a) there are dozens of them in use at any given time, or b) they don't change back to non-physical when not in use.

Boats and cars must return to a non-physical state once their driver and passengers have alighted.

If you see a car rolling about on its own with no driver, or a boat floundering at the bottom of the sea, the vehicle is still physical.

That vehicle is causing massive lag. One vehicle bouncing around of its own accord is all it takes to crash a sim. Delete it and fix its scripts.

Bots (NPCs) .:. lag in OpenSim

In the case of a bot - or any normal avatar for that matter - when it makes contact with the terrain or a non-phantom prim, this creates a collision. Yes, every footstep of your avatar creates a collision, which in turn induces a minute amount of lag.

NPC in OpenSim

A bot places roughly the same drain on a region's resources as a normal avatar, so if your region can handle 100 avatars, it could theoretically handle 100 bots instead.

What you have to look out for with bots is their environment. Bots are usually used for display or interactive purposes. Sometimes they have a stand or are in a protective cage or enclosure so people can't bump into them or push them around.

Clever thinking, but here's the problem. If that protective enclosure or stand is, say, a hollowed-out cylinder or cube, that creates a lot of physical surfaces inside the prim that your bot can collide with.

If you use a bot enclosure, make sure that it is big enough so that the bot won't continually collide with it from inside.

In the case of Kiwi Island, I did find two bots with hollow enclosures that were too small. We remedied this, restarted the region, and held our breath. 15 minutes passed. Feeling optimistic, we tried terraforming. Bang. The region restart error flashed up again.

Part of the problem solved, but not all of it.

This extra piece of good advice regarding bots was offered by Aine Caoimhe: "Unless there is an urgent need for a NPC to be free-standing and moving around you should always rez and immediately seat it on something. If you want it to look like it's just standing there, have it 'sit' on an invisible prim sphere and play a standing animation and it will look perfect. The huge advantage of doing this is that bots (and avatars) are non-physical when seated so you completely remove their load/drain on the physics engine as soon as they're seated. In the case of NPCs which don't have all of the other activities needed for viewer-driven agents, this makes them then use almost no more resources than a prim."

Non-phantom objects .:. lag in OpenSim

This is such an obvious problem that it's scary how easy it is to overlook. And I did. Totally guilty of this idiot mistake.

Spinning physical object in OpenSim

After looking at all the techy backend hoo-haa for hours, non-phantom objects turned out to be the cause of 99% of Kiwi's memory leaks and subsequent crashes.

When finally it occurred to me, Leah and I went across the island parcel by parcel. We found a stationary submarine with a slow spinning prop. Non-phantom. Fixed it.

A night club with two big turbine fans on the walls, essentially non-phantom discs spinning inside a non-phantom hollow cylinder. Massive lag-makers. Fixed them.

On and on we went. Each change that we made brought about a slight improvement to the region's performance but we were still missing something big.

Purely by chance I reached a sim edge and noticed that I couldn't get any closer than 2 metres from the edge. Leah informed me that she had installed an invisible wall around the entire varregion to prevent vehicles from being driven off-world.

"Good idea", I thought, then pondered this for a moment. "Did the problems start around the time that you installed the walls?" I asked. "Yes", she said, "but they're just plain cube prims with no scripts, so I couldn't see them causing lag."

Neither could I for a moment, then I remembered all the lighthouses scattered around the perimeter of the island. My lighthouses, by the way, ones that I had made on a previous grid and imported here as .oxp files.

Colliding lighthouse beams in OpenSim

I tried to recall if I had checked whether the light beams had imported as phantom, all the while suspecting they had not, since prim properties often change to solid by default upon import of an .oxp file.

A quick inspection showed that the beams were not phantom, and therein lay the answer to the entire lag problem.

Eight lighthouses, each with three hollow cylinder beams 100 metres long, were colliding with Leah's new boundary wall on a 20-second rotation cycle.

Every internal and external surface of those beams was making contact with the wall.

That equated to thousands of unnecessary collision events every minute, every hour, every day. No wonder the region was struggling with the load.
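For a feel of how quickly this adds up, here is a rough estimate in Python. Only the lighthouse and beam counts come from our region. The contact rate per beam is a hypothetical parameter, since I never measured how often the physics engine reported each beam touching the wall, so treat the output as an order of magnitude at best.

```python
def collision_events_per_minute(lighthouses, beams_each, contacts_per_beam_per_second):
    """Estimate collision events per minute from non-phantom rotating beams.

    contacts_per_beam_per_second is an ASSUMED figure, not a measurement.
    """
    beams = lighthouses * beams_each
    return beams * contacts_per_beam_per_second * 60

# 8 lighthouses with 3 beams each, assuming just one contact per beam per second.
print(collision_events_per_minute(8, 3, 1.0))  # prints 1440.0
```

Even at that modest assumed rate, 24 beams produce well over a thousand events a minute, and every extra surface in contact pushes the number higher.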

We ran about making all the lighthouse beams phantom.

Each one that we changed made a tangible difference to how fluidly our avatars moved and how quickly objects rezzed around us. We could literally feel the lag lifting from our shoulders :)

Kiwi is back to normal now and we haven't had a whisper of lag since.

I hope you gain something from our experience.

Xay Tomsen a.k.a. Andrew Thompson
