Testing of Persistence Configurations Logbook.

Using the ActiveMQ message store and the MSG publisher/MSG consumer Python scripts.

http://activemq.apache.org/amq-message-store.html

Test Scenarios:

-- Send messages in a loop for 2 minutes, pause 10 minutes, send a few messages at random for 10 minutes, pause 10 minutes. Duration: 1 hour. Message size: random, 1, 2 or 5k.
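The cycle above can be written down as a small schedule. This is only a sketch of the timing, not the actual publisher script (which is not reproduced in this log): the names PHASES, SIZES, make_payload and plan are made up for illustration, and "1,2,5k" is assumed to mean 1, 2 or 5 KiB.

```python
import random

# One cycle as (phase name, duration in seconds), taken from the scenario text.
PHASES = [("burst", 120), ("pause", 600), ("trickle", 600), ("pause", 600)]
SIZES = [1024, 2048, 5120]  # assumption: "1,2,5k" means 1, 2 or 5 KiB


def make_payload(rng):
    """Return a dummy message body with a randomly chosen size."""
    return b"x" * rng.choice(SIZES)


def plan(total_seconds=3600):
    """Yield (start_offset, phase_name, duration) until total_seconds is filled."""
    t = 0
    while t < total_seconds:
        for name, duration in PHASES:
            if t >= total_seconds:
                break
            # Cut the last phase short so the plan ends exactly on time.
            duration = min(duration, total_seconds - t)
            yield t, name, duration
            t += duration


if __name__ == "__main__":
    for start, name, duration in plan():
        print(start, name, duration)
```

One cycle lasts 32 minutes, so a 1 hour run covers one full cycle plus a second cycle whose final pause is cut short.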

  • First observations:
    • A lot of messages were lost when running the stress cycle. The actual problem was the limit on open file descriptors: each script opened a new connection, and the server closed connections more slowly than new ones were created. After the ulimit was increased, the problem stabilized.
    • Persistence seems to work as long as the server processes the messages into the message store (no messages are lost due to the connection failure previously described).
    • MessagesEncodedDecodedLongPersistence.png

  • Increasing the first loop to 10 minutes causes a big degradation as the number of open connections grows.
  • Sent messages: 30051 - Received messages: 29449 - Lost: ~2%
  • Error occurred for a few messages: " ERROR RecoveryListenerAdapter - Message id ID:lxb6118.cern.ch-51583-1204887427969-4:52708:-1:1:1 could not be recovered from the data store! "
  • From ActiveMQ: https://issues.apache.org/activemq/browse/AMQ-1445 Fixed in 5.1.0. Perhaps we should consider moving there :S
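Since the first round of losses came from the open file descriptor limit, it may help to check that limit before a run. A minimal sketch using Python's standard resource module (Unix only); the helper names are made up for illustration. It only inspects and adjusts the limit of the Python process it runs in, so the broker's own limit still has to be raised in the environment that launches the broker.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_nofile(target):
    """Raise the soft limit towards target, never above the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft/hard limit on open files:", nofile_limits())
```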

Test3 - Sending messages in bulk instead.

Running 2 producers and 1 consumer, with messages in bulks of 1000 x 1K, for 3 hours. Loop: send the maximum number of messages for 1 hour, sleep 10 min, send a few messages for 3 min, sleep 5 min, repeat. Producer 2 kicks in 40 minutes after the first producer.

A few messages failed. The only traceable errors were:

2008-03-11 18:23:00,139 [138.5.237:33191] ERROR Service                        - Async error occurred: java.lang.RuntimeException: org.apache.activemq.kaha.RuntimeStoreException: java.io.IOException: Could not locate data file data-topic-data-1 

2008-03-11 18:23:02,840 [138.5.237:33191] ERROR DataManagerImpl                - Looking for key 1 but not found in fileMap: {2=data-topic-data-2 number = 2 , length = 33554418 refCount = 7316, 3=data-topic-data-3 number = 3 , length = 4831686 refCount = 2322} 

2008-03-11 18:23:02,840 [138.5.237:33191] ERROR MapContainerImpl               - Failed to get value for offset=730779, key=(1, 3779446, 53), value=(1, 3779504, 69), previousItem=0, nextItem=-1 

2008-03-11 18:23:02,941 [138.5.237:33191] ERROR TopicStorePrefetch             - Failed to fill batch 

2008-03-11 18:23:02,941 [138.5.237:33191] ERROR Service                        - Async error occurred: java.lang.RuntimeException: org.apache.activemq.kaha.RuntimeStoreException: java.io.IOException: Could not locate data file data-topic-data-1 

2008-03-11 18:23:09,512 [42.131.89:33644] ERROR DataManagerImpl                - Looking for key 1 but not found in fileMap: {2=data-topic-data-2 number = 2 , length = 33554418 refCount = 7316, 3=data-topic-data-3 number = 3 , length = 4886960 refCount = 2300} 

2008-03-11 18:23:09,512 [42.131.89:33644] ERROR MapContainerImpl               - Failed to get value for offset=730779, key=(1, 3779446, 53), value=(1, 3779504, 69), previousItem=0, nextItem=-1 

2008-03-11 18:23:09,614 [42.131.89:33644] ERROR TopicStorePrefetch             - Failed to fill batch 

2008-03-11 18:23:09,617 [42.131.89:33644] ERROR StoreDurableSubscriberCursor   - Failed to get current cursor 
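To see which broker components report errors most often in a longer log, the lines can be tallied per logger. This is a rough sketch that assumes every line follows the layout visible in the excerpt above (timestamp, thread in brackets, level, logger name, dash, message); LINE_RE and count_errors are names made up here.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: "<date> <time> [<thread>] <LEVEL> <logger>  - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>[^\]]+)\] (?P<level>\w+)\s+(?P<logger>\S+)\s+- (?P<msg>.*)$"
)


def count_errors(lines):
    """Count ERROR lines per logger name; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "ERROR":
            counts[m.group("logger")] += 1
    return counts


# Two lines copied from the excerpt above.
SAMPLE = [
    "2008-03-11 18:23:02,941 [138.5.237:33191] ERROR TopicStorePrefetch             - Failed to fill batch",
    "2008-03-11 18:23:09,617 [42.131.89:33644] ERROR StoreDurableSubscriberCursor   - Failed to get current cursor",
]

if __name__ == "__main__":
    for logger, n in count_errors(SAMPLE).most_common():
        print(logger, n)
```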

I have already sent a message to the ActiveMQ users mailing list to ask whether this is a known issue, and will try to reproduce it in the meantime. The first messages lost on producer Plxplus225.cern.ch-570 were {179945,179946}:

179944   Plxplus225.cern.ch-570   20.5527989   1205252069.339927   1205252089.8927259
179947   Plxplus225.cern.ch-570   20.5527989   1205252069.339927   1205252089.8927259
In total, 523037 messages were sent and 520206 received (0.54% lost).

On producer Plxplus236-570, 519037 messages were sent and 516946 received (0.40% lost). First messages lost: {17543,17544}

17541   Plxplus236.cern.ch-570   20.5527989   1205252069.339927   1205252089.8927259
17542   Plxplus236.cern.ch-570   20.5527989   1205252069.339927   1205252089.8927259
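The loss figures and the "first messages lost" entries can be recomputed from the consumer output. A small sketch, assuming the first column of the rows above is a per-producer sequence number; loss_percent and missing_ids are illustrative names, not part of the test scripts.

```python
def loss_percent(sent, received):
    """Percentage of sent messages that never arrived."""
    return 100.0 * (sent - received) / sent


def missing_ids(received_ids):
    """Return the sequence numbers absent between the smallest and largest received id."""
    seen = set(received_ids)
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


if __name__ == "__main__":
    print(round(loss_percent(523037, 520206), 2))  # 0.54
    print(missing_ids([179944, 179947]))           # [179945, 179946]
```

Building the set of received ids keeps the gap search linear in the size of the id range, which should be fine for runs of around half a million messages.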

ActiveMqStore_longRun3hours2ProducersII.png

Test4 - Using JDBC in addition to the ActiveMQ store

Awfully slow :(

JDBCPersistence_longRun7hours2Producers.png

Notes

http://www.sonicsoftware.com/products/sonicmq/performance_benchmarking/index.ssp

-- DanielRodrigues - 11 Mar 2008

Topic revision: r5 - 2008-03-13 - DanielRodrigues
 