Testing of Master Slave Configurations Logbook.

SAN Master Slave Logbook

Using http://activemq.apache.org/amq-message-store.html

SAN configuration for lxb6118/6117: using remote machine lxmrrb3705. The folder /mnt/shareddfs/activemq is available for use when logged in as user activemq.
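
As a rough sketch of how both brokers might point the AMQ message store at the shared folder: the `amqPersistenceAdapter` element and its `directory` attribute come from the ActiveMQ AMQ message store documentation linked above. The broker name `san-broker` and port 61616 are hypothetical, and the namespace is the one used for the 5.0 slave configuration further down this logbook, so it may need adjusting for other versions. The configuration actually used in this test is not recorded here.

```xml
<!-- Sketch only: assumes master and slave use the same persistence directory -->
<broker brokerName="san-broker" xmlns="http://activemq.org/config/1.0">
  <persistenceAdapter>
    <!-- shared SAN folder from the note above -->
    <amqPersistenceAdapter directory="/mnt/shareddfs/activemq"/>
  </persistenceAdapter>
  <transportConnectors>
    <!-- port is an assumption (ActiveMQ default) -->
    <transportConnector uri="tcp://localhost:61616"/>
  </transportConnectors>
</broker>
```

With a shared store of this kind, the broker that obtains the file lock first should act as master while the other waits for the lock.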

Test 1:

a) Master receiving messages, Consumer On : ~10 min;

b) Stop Consumer, Stop Master;

c) Slave takes over; send a few messages, then send lots of messages: ~20 min;

d) Slave stopped, Master retakes control. Consumer restarted: ~6 min;

MasterSlaveChange.png

Important problem (already using 5.1, but still traceable back): with SAN persistence and Master/Slave depending on it, recovery takes a very long time. Tracking it on the ActiveMQ users forum: http://www.nabble.com/Persistence-From-Redo-Log-Only--td15869535s2354.html

JDBC Master Slave Logbook.

Oracle database connection configuration, added to /etc/activemq/activemq.xml:


  <bean id="oracle-ds" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName" value="oracle.jdbc.driver.OracleDriver"/>
    <property name="url" value="jdbc:oracle:thin:@int11r1-v.cern.ch:10121:int11r1"/>
    <property name="username" value="lcg_sam_messaging"/>
    <property name="password" value="<hidden>"/>
    <property name="poolPreparedStatements" value="true"/>
  </bean>
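
For the broker to use this data source for persistence, it would typically be referenced from a `jdbcPersistenceAdapter`, as in the JDBC Master Slave example in the ActiveMQ documentation. A minimal sketch, assuming the bean id `oracle-ds` above; the broker name `jdbc-broker` is hypothetical and the namespace is the 5.0 one used elsewhere in this logbook:

```xml
<!-- Sketch only: the broker name is an assumption -->
<broker brokerName="jdbc-broker" xmlns="http://activemq.org/config/1.0">
  <persistenceAdapter>
    <!-- "#oracle-ds" refers to the Spring bean with id="oracle-ds" -->
    <jdbcPersistenceAdapter dataSource="#oracle-ds"/>
  </persistenceAdapter>
</broker>
```

Master and slave would carry the same adapter configuration; the broker that holds the lock on the ACTIVEMQ_LOCK table acts as master.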

-- 25 Feb

It was a long way to set up Oracle properly: the number of sessions was limited for lcg_sam_messaging, which frequently caused "number of sessions limit exceeded" errors. lcg_sam_messaging_W was created with the following grants:

GRANT SELECT, INSERT, UPDATE, DELETE ON ACTIVEMQ_ACKS TO LCG_SAM_MESSAGING_W;                  
GRANT SELECT, INSERT, UPDATE, DELETE ON ACTIVEMQ_LOCK TO LCG_SAM_MESSAGING_W;                  
GRANT SELECT, INSERT, UPDATE, DELETE ON ACTIVEMQ_MSGS TO LCG_SAM_MESSAGING_W;  

Still, this requires two things: 1. having the tables already configured; 2. creating synonyms in lcg_sam_messaging_w for the given tables. Otherwise it fails with "No such table or view", because it tries to access the tables in its own schema.

SQL> CREATE SYNONYM ACTIVEMQ_ACKS FOR LCG_SAM_MESSAGING.ACTIVEMQ_ACKS;
SQL> CREATE SYNONYM ACTIVEMQ_LOCK FOR LCG_SAM_MESSAGING.ACTIVEMQ_LOCK;
SQL> CREATE SYNONYM ACTIVEMQ_MSGS FOR LCG_SAM_MESSAGING.ACTIVEMQ_MSGS;

* test flow: initiated master :: initiated slave :: persistent subscription connected to master :: disconnect client :: send persistent messages to master :: shutdown master :: connect client to slave --> Client receives the messages persisted on master from slave! Back and forth, jumping from master to slave!

With HighPerformance Journal persistence.

Not possible. From the ActiveMQ documentation: "Requires a shared database. Also relatively slow as it cannot use the high performance journal."

Pure Master Slave Logbook.

-- 22 Feb 2008

Configuration:
consumer publisher
consumerConfig
publisherconfig

Well, some strange combination of keys logged me out. The last notes were on the following: 1. ActiveMQ 5.0 does not seem to recognize the older configuration style; better to use:

<broker brokerName="slave" useJmx="false" deleteAllMessagesOnStartup="true" xmlns="http://activemq.org/config/1.0">
  <transportConnectors>
    <transportConnector uri="tcp://localhost:62002"/>
    <!-- Other Connectors -->
  </transportConnectors>
  <services>
    <masterConnector remoteURI="tcp://lxb6118:62001"/>
  </services>
</broker>
2. Connectors seem to be available from the start :S According to the ActiveMQ documentation, they should only be started after the slave takes over. Reading the log, the slave does seem to take over only after the master crashes, but the connection is available as soon as the slave is up. To be seen whether messages are processed or not in this case.
3. lxb6118 and lxb6117 are unavailable again... :(
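
For completeness, a hedged sketch of what the matching master side might look like. According to the ActiveMQ documentation, in Pure Master Slave the master needs no slave-specific settings. The host and port are taken from the slave's `masterConnector` above; the broker name and `useJmx` attribute are assumptions mirrored from the slave configuration, not the recorded setup.

```xml
<!-- Sketch only: mirrors the slave configuration, minus the masterConnector -->
<broker brokerName="master" useJmx="false" xmlns="http://activemq.org/config/1.0">
  <transportConnectors>
    <!-- the slave's masterConnector remoteURI points at this address -->
    <transportConnector uri="tcp://lxb6118:62001"/>
  </transportConnectors>
</broker>
```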

-- 25 Feb

4. Connectors are started and connections may be established, but while the master is up, only the master responds to any request.
5. How about with durable subscriptions?
a) After we crash the master, the slave takes over correctly.
b) Persistent messages are delivered to the slave after the master crashes:

  • test flow: initiated master :: initiated slave :: persistent subscription connected to master :: disconnect client :: send persistent messages to master :: shutdown master :: connect client to slave --> Client receives the messages persisted on master from slave!
c) Errors occur if the data directories (/usr/share/activemq/data/{master|slave}) are not copied over from slave to master! (This is according to the ActiveMQ documentation.)
d) When sending messages to a persistent topic, the following error is shown in the master log:
[root@lxb6118 data]# 2008-02-25 11:35:02,146 [141.72.90:33223] ERROR MasterBroker - Slave Failed
javax.jms.JMSException: Slave broker out of sync with master: Acknowledgment (MessageAck {commandId = 14, responseRequired = true, ackType = 2, consumerId = ID:lxb6118.cern.ch-44642-1203935607862-4:0:-1:1, firstMessageId = null, lastMessageId = ID:lxb6118.cern.ch-44642-1203935607862-4:0:-1:1:2, destination = topic://test.MasterSlave, transactionId = null, messageCount = 1}) was not in the dispatch list: []
        at org.apache.activemq.broker.region.PrefetchSubscription.acknowledge(PrefetchSubscription.java:332)
        (...)
This does not seem to affect the system, and messages are not lost. (Update: this error occurs when the persistent subscription is not set to "Ack: client".)

Conclusion: the Pure Master Slave configuration works functionally. However, manual restarts are not desirable, so its use may be limited to state backup. TODO: performance testing.


Testing environment information

Available Boxes for Grid Messaging System testing:

| Machines | Typical usage | Comments |
| lxb6118 | Master | Quattor installed |
| lxb6117 | Slave | Quattor installed |
| gridmsg001 | Not suitable for testing | Being used by OSG, publishing into SAM database |
| gridmsg002 | Real use-case testing | Being used for Gridview dataTransfer records publishing + Dashboard + Condor |

Oracle connection:

* connection: int11r1-v.cern.ch:10121:int11r1
* username: lcg_sam_messaging
* password: <>

Managing sessions is possible here: https://oraweb.cern.ch/pls/int11r/webinstance.sessions.show_sessions

Additional users have been created, _R and _W, with more connections available. https://twiki.cern.ch/twiki/bin/view/PSSGroup/UserAccounts

-- DanielRodrigues - 22 Feb 2008

Topic revision: r8 - 2008-03-20 - DanielRodrigues