Difference: TriggerSoftwareUpgradePublicResults (1 vs. 8)

Revision 8 (2016-10-06) - AdemarDelgado

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 155 to 155
 
Added:
>
>

ATL-COM-DAQ-2016-133 Performance plots of HLT algorithms ported to GPU

Cluster Growing algorithm timing. Timing of the Calorimeter Cluster Growing phase of the CPU Topological Clustering (blue line) and the GPU Topological Automaton Cluster [TAC] (red dashed line) algorithms for the full detector. The TAC time includes the processing time and the overheads (data conversion and transfer). The execution time of the algorithms was measured using a data sample of QCD di-jet events with leading-jet transverse momentum above 20 GeV and a fixed number of 40 simultaneous interactions per bunch-crossing. The Topological Clustering runs on a single CPU core of an AMD FX-8320 processor (3.5 GHz) and the TAC runs on an NVidia GTX650 card.
png eps pdf
Cluster Growing algorithm timing. Timing of the Calorimeter Cluster Growing phase of the CPU Topological Clustering (blue line) and the GPU Topological Automaton Cluster [TAC] (red dashed line) algorithms for the full detector. The TAC time includes the processing time and the overheads (data conversion and transfer). The execution time of the algorithms was measured using a data sample of inclusive top quark pair production with 138 simultaneous interactions per bunch-crossing. The Topological Clustering runs on a single CPU core of an AMD FX-8320 processor (3.5 GHz) and the TAC runs on an NVidia GTX650 card.
png eps pdf
Timing of the GPU Topological Automaton Cluster [TAC] cluster conversion overhead (purple line) and Cluster Growing (green dashed line). The remaining 5 ms of the TAC total execution time is a constant overhead due to the cell data conversion, data transfer and Inter Process Communication (IPC). The execution time of the algorithms was measured using a data sample of QCD di-jet events with leading-jet transverse momentum above 20 GeV and a fixed number of 40 simultaneous interactions per bunch-crossing. The data conversion runs on a single CPU core of an AMD FX-8320 processor (3.5 GHz) and the Cluster Growing runs on an NVidia GTX650 card.
png eps pdf
Timing of the GPU Topological Automaton Cluster [TAC] cluster conversion overhead (purple line) and Cluster Growing (green dashed line). The remaining 5 ms of the TAC total execution time is a constant overhead due to the cell data conversion, data transfer and Inter Process Communication (IPC). The execution time of the algorithms was measured using a data sample of inclusive top quark pair production with 138 simultaneous interactions per bunch-crossing. The data conversion runs on a single CPU core of an AMD FX-8320 processor (3.5 GHz) and the Cluster Growing runs on an NVidia GTX650 card.
png eps pdf
Relative transverse energy difference of the matched calorimeter clusters reconstructed using the standard CPU cell clustering algorithms and the similar logical algorithm ported to GPU. These are raw clusters, before the execution of the cluster splitting algorithm. The algorithms differ in the way they group the less significant cells: on the CPU they belong to the first cluster that reaches them, while on the GPU they belong to the cluster with the most energetic seed, resulting in the difference observed in lower-ET clusters. The x-axis shows the CPU cluster transverse energy in GeV. The y-axis shows the corresponding transverse energy difference, CPU-GPU, divided by the CPU cluster transverse energy. Clusters are matched using the group of cluster seed cells, a unique cluster identifier that is invariant under the algorithm used. The data sample used consists of QCD di-jet events with leading-jet transverse momentum above 20 GeV and a fixed number of 40 simultaneous interactions per bunch-crossing. The Topological Clustering runs on a single CPU core of an AMD FX-8320 processor (3.5 GHz) and the TAC runs on an NVidia GTX650 card.
png eps pdf
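For clarity, the y-axis quantity of the last plot, restating the caption in symbols (no content beyond the caption itself):

\[ \frac{\Delta E_T}{E_T} = \frac{E_T^{\mathrm{CPU}} - E_T^{\mathrm{GPU}}}{E_T^{\mathrm{CPU}}} \]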
 
<!-- ********************************************************* -->
<!-- Do NOT remove the remaining lines, but add requested info as appropriate-->
<!-- ********************************************************* -->
Line: 207 to 282
 
META FILEATTACHMENT attachment="HLT_pT_eff.png" attr="" comment="" date="1474644530" name="HLT_pT_eff.png" path="HLT_pT_eff.png" size="16998" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT_eff.eps" attr="" comment="" date="1474644530" name="HLT_pT_eff.eps" path="HLT_pT_eff.eps" size="10853" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT_eff.pdf" attr="" comment="" date="1474644530" name="HLT_pT_eff.pdf" path="HLT_pT_eff.pdf" size="15306" user="demelian" version="1"
Added:
>
>
META FILEATTACHMENT attachment="L1J50_ClusterMakerTime_jet.eps" attr="" comment="ClusterGrow Total Time" date="1475764059" name="L1J50_ClusterMakerTime_jet.eps" path="L1J50_ClusterMakerTime_jet.eps" size="9930" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_ClusterMakerTime_jet.pdf" attr="" comment="ClusterGrow Total Time" date="1475764059" name="L1J50_ClusterMakerTime_jet.pdf" path="L1J50_ClusterMakerTime_jet.pdf" size="14665" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_ClusterMakerTime_jet.png" attr="" comment="ClusterGrow Total Time" date="1475764059" name="L1J50_ClusterMakerTime_jet.png" path="L1J50_ClusterMakerTime_jet.png" size="17485" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_ClusterMakerTime_ttbar.eps" attr="" comment="ClusterGrow Total Time" date="1475764059" name="L1J50_ClusterMakerTime_ttbar.eps" path="L1J50_ClusterMakerTime_ttbar.eps" size="10050" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_ClusterMakerTime_ttbar.pdf" attr="" comment="ClusterGrow Total Time" date="1475764059" name="L1J50_ClusterMakerTime_ttbar.pdf" path="L1J50_ClusterMakerTime_ttbar.pdf" size="14697" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_ClusterMakerTime_ttbar.png" attr="" comment="ClusterGrow Total Time" date="1475764059" name="L1J50_ClusterMakerTime_ttbar.png" path="L1J50_ClusterMakerTime_ttbar.png" size="17172" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_GrowStepsTime_jet.eps" attr="" comment="ClusterGrow steps time" date="1475764318" name="L1J50_GrowStepsTime_jet.eps" path="L1J50_GrowStepsTime_jet.eps" size="10182" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_GrowStepsTime_jet.pdf" attr="" comment="ClusterGrow steps time" date="1475764318" name="L1J50_GrowStepsTime_jet.pdf" path="L1J50_GrowStepsTime_jet.pdf" size="14702" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_GrowStepsTime_jet.png" attr="" comment="ClusterGrow steps time" date="1475764318" name="L1J50_GrowStepsTime_jet.png" path="L1J50_GrowStepsTime_jet.png" size="18952" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_GrowStepsTime_ttbar.eps" attr="" comment="ClusterGrow steps time" date="1475764318" name="L1J50_GrowStepsTime_ttbar.eps" path="L1J50_GrowStepsTime_ttbar.eps" size="10387" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_GrowStepsTime_ttbar.pdf" attr="" comment="ClusterGrow steps time" date="1475764318" name="L1J50_GrowStepsTime_ttbar.pdf" path="L1J50_GrowStepsTime_ttbar.pdf" size="14724" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_GrowStepsTime_ttbar.png" attr="" comment="ClusterGrow steps time" date="1475764318" name="L1J50_GrowStepsTime_ttbar.png" path="L1J50_GrowStepsTime_ttbar.png" size="18234" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_deltaEtvsEt_jet.eps" attr="" comment="ClusterGrow Et CPU vs GPU" date="1475764318" name="L1J50_deltaEtvsEt_jet.eps" path="L1J50_deltaEtvsEt_jet.eps" size="30699" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_deltaEtvsEt_jet.pdf" attr="" comment="ClusterGrow Et CPU vs GPU" date="1475764318" name="L1J50_deltaEtvsEt_jet.pdf" path="L1J50_deltaEtvsEt_jet.pdf" size="19012" user="tavares" version="1"
META FILEATTACHMENT attachment="L1J50_deltaEtvsEt_jet.png" attr="" comment="ClusterGrow Et CPU vs GPU" date="1475764318" name="L1J50_deltaEtvsEt_jet.png" path="L1J50_deltaEtvsEt_jet.png" size="22621" user="tavares" version="1"

Revision 7 (2016-09-24) - JohnTMBaines

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 184 to 184
 
META FILEATTACHMENT attachment="speedupG2.pdf" attr="" comment="" date="1474550365" name="speedupG2.pdf" path="speedupG2.pdf" size="21694" user="baines" version="1"
META FILEATTACHMENT attachment="speedupG2.png" attr="" comment="" date="1474550365" name="speedupG2.png" path="speedupG2.png" size="21361" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutiontimePiChart3.png" attr="" comment="" date="1474557154" name="CaloExecutiontimePiChart3.png" path="CaloExecutiontimePiChart3.png" size="138650" user="baines" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart3.pdf" path="IDexecutiontimePiChart3.pdf" size="17455" user="baines" version="1"
>
>
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.pdf" attr="" comment="" date="1474737185" name="IDexecutiontimePiChart3.pdf" path="IDexecutiontimePiChart3.pdf" size="17472" user="baines" version="2"
 
META FILEATTACHMENT attachment="CaloExecutionTimePiChart1.pdf" attr="" comment="" date="1474557266" name="CaloExecutionTimePiChart1.pdf" path="CaloExecutionTimePiChart1.pdf" size="14793" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart1.pdf" path="IDexecutiontimePiChart1.pdf" size="15535" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.pdf" attr="" comment="" date="1474620821" name="CaloExecutionTimePiChart2.pdf" path="CaloExecutionTimePiChart2.pdf" size="16276" user="baines" version="2"
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart2.pdf" path="IDexecutiontimePiChart2.pdf" size="17020" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart3.pdf" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart3.pdf" path="CaloExecutionTimePiChart3.pdf" size="16899" user="baines" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.png" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart3.png" path="IDexecutiontimePiChart3.png" size="157522" user="baines" version="1"
>
>
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.png" attr="" comment="" date="1474737151" name="IDexecutiontimePiChart3.png" path="IDexecutiontimePiChart3.png" size="115840" user="baines" version="2"
 
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.png" attr="" comment="" date="1474620821" name="CaloExecutionTimePiChart2.png" path="CaloExecutionTimePiChart2.png" size="116686" user="baines" version="2"
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.png" attr="" comment="" date="1474557155" name="IDexecutiontimePiChart2.png" path="IDexecutiontimePiChart2.png" size="135808" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.png" attr="" comment="" date="1474557266" name="IDexecutiontimePiChart1.png" path="IDexecutiontimePiChart1.png" size="48314" user="baines" version="1"

Revision 6 (2016-09-23) - JohnTMBaines

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 98 to 98
 
Deleted:
<
<
<!-- Template for adding a new plot -->

<!-- Put Text Here -->

<!-- attach the files and then insert the filename name in the three places below -->


png pdf

 

Revision 5 (2016-09-23) - DmitryEmeliyanov1

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 117 to 117
 
Added:
>
>

ATL-COM-DAQ-2016-119 Performance plots of HLT Inner Detector tracking algorithm implemented on GPU

Transverse impact parameter distributions for the simulated tracks correctly reconstructed by the GPU-accelerated tracking algorithm and the standard CPU-only algorithm. The reference CPU algorithm was FastTrackFinder, consisting of a track seed (spacepoint triplet) maker and combinatorial track following; the GPU algorithm was FastTrackFinder with a GPU-accelerated track seed maker. The simulated tracks were required to have pT > 1 GeV and |eta| < 2.5.
png eps pdf
Transverse momentum distributions for the simulated tracks correctly reconstructed by the GPU-accelerated tracking algorithm and the standard CPU-only algorithm. The reference CPU algorithm was FastTrackFinder, consisting of a track seed (spacepoint triplet) maker and combinatorial track following; the GPU algorithm was FastTrackFinder with a GPU-accelerated track seed maker. The simulated tracks were required to have pT > 1 GeV and |eta| < 2.5.
png eps pdf
Track reconstruction efficiency as a function of simulated track azimuth for the GPU-accelerated tracking algorithm and the standard CPU-only algorithm. The reference CPU algorithm was FastTrackFinder, consisting of a track seed (spacepoint triplet) maker and combinatorial track following; the GPU algorithm was FastTrackFinder with a GPU-accelerated track seed maker. The simulated tracks were required to have pT > 1 GeV and |eta| < 2.5.
png eps pdf
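As a rough illustration of the track seed (spacepoint triplet) maker referred to in these captions — a minimal CUDA sketch, not the ATLAS FastTrackFinder code; the struct, kernel name and the z0 compatibility cut are all assumptions — a GPU triplet search can assign one thread per middle spacepoint:

<verbatim>
// Sketch of a GPU spacepoint-triplet seed search (one thread per middle
// spacepoint). Hypothetical names and cuts, for illustration only.
#include <cmath>

struct SpacePoint { float x, y, z, r; };   // r = cylindrical radius

__global__ void makeTripletsKernel(const SpacePoint* sp, int n,
                                   float maxDeltaZ0, unsigned* nSeeds)
{
    int m = blockIdx.x * blockDim.x + threadIdx.x;   // middle spacepoint
    if (m >= n) return;
    for (int i = 0; i < n; ++i) {                    // inner spacepoint
        if (sp[i].r >= sp[m].r) continue;
        // z of the inner->middle segment extrapolated to the beamline (r=0)
        float z0in = sp[m].z - sp[m].r * (sp[m].z - sp[i].z) / (sp[m].r - sp[i].r);
        for (int o = 0; o < n; ++o) {                // outer spacepoint
            if (sp[o].r <= sp[m].r) continue;
            float z0out = sp[o].z - sp[o].r * (sp[o].z - sp[m].z) / (sp[o].r - sp[m].r);
            // keep triplets whose two segments point back to a common z0
            if (fabsf(z0in - z0out) < maxDeltaZ0)
                atomicAdd(nSeeds, 1u);  // a real seed maker would store the triplet
        }
    }
}
</verbatim>

A real seed maker would restrict the inner and outer loops to neighbouring detector regions and store the surviving triplets for the combinatorial track following, rather than just counting them.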
 
Added:
>
>

Track reconstruction efficiency as a function of simulated track transverse momentum for the GPU-accelerated tracking algorithm and the standard CPU-only algorithm. The reference CPU algorithm was FastTrackFinder, consisting of a track seed (spacepoint triplet) maker and combinatorial track following; the GPU algorithm was FastTrackFinder with a GPU-accelerated track seed maker. The simulated tracks were required to have pT > 1 GeV and |eta| < 2.5.
png eps pdf
 
<!-- ********************************************************* -->
<!-- Do NOT remove the remaining lines, but add requested info as appropriate-->
Line: 157 to 210
 
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.png" attr="" comment="" date="1474557155" name="IDexecutiontimePiChart2.png" path="IDexecutiontimePiChart2.png" size="135808" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.png" attr="" comment="" date="1474557266" name="IDexecutiontimePiChart1.png" path="IDexecutiontimePiChart1.png" size="48314" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart1.png" attr="" comment="" date="1474557703" name="CaloExecutionTimePiChart1.png" path="CaloExecutionTimePiChart1.png" size="102486" user="baines" version="1"
Added:
>
>
META FILEATTACHMENT attachment="HLT_a0.eps" attr="" comment="" date="1474643800" name="HLT_a0.eps" path="HLT_a0.eps" size="13712" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_a0.pdf" attr="" comment="" date="1474643800" name="HLT_a0.pdf" path="HLT_a0.pdf" size="16821" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_a0.png" attr="" comment="" date="1474643800" name="HLT_a0.png" path="HLT_a0.png" size="17822" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT.eps" attr="" comment="" date="1474644530" name="HLT_pT.eps" path="HLT_pT.eps" size="10345" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT.pdf" attr="" comment="" date="1474644530" name="HLT_pT.pdf" path="HLT_pT.pdf" size="15170" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT.png" attr="" comment="" date="1474644530" name="HLT_pT.png" path="HLT_pT.png" size="17063" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_phi_eff.png" attr="" comment="" date="1474644530" name="HLT_phi_eff.png" path="HLT_phi_eff.png" size="15715" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_phi_eff.eps" attr="" comment="" date="1474644530" name="HLT_phi_eff.eps" path="HLT_phi_eff.eps" size="10455" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_phi_eff.pdf" attr="" comment="" date="1474644530" name="HLT_phi_eff.pdf" path="HLT_phi_eff.pdf" size="14799" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT_eff.png" attr="" comment="" date="1474644530" name="HLT_pT_eff.png" path="HLT_pT_eff.png" size="16998" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT_eff.eps" attr="" comment="" date="1474644530" name="HLT_pT_eff.eps" path="HLT_pT_eff.eps" size="10853" user="demelian" version="1"
META FILEATTACHMENT attachment="HLT_pT_eff.pdf" attr="" comment="" date="1474644530" name="HLT_pT_eff.pdf" path="HLT_pT_eff.pdf" size="15306" user="demelian" version="1"

Revision 4 (2016-09-23) - JohnTMBaines

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 21 to 21
 

The ratio of event throughput rates with GPU acceleration to the CPU-only rates as a function of the number of ATLAS trigger (Athena) processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to either perform the work on the CPU or offload it to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. The ID track seeding takes about 30% of the event processing time on the CPU and is accelerated by about a factor of 5 on the GPU. As a result, throughput increases by about 35% with GPU acceleration for up to 14 Athena processes. The Calorimeter clustering algorithm takes about 8% of the event processing time on the CPU and is accelerated by about a factor of 2 on the GPU; however, the effect of the acceleration is offset by a small increase in the time of the non-accelerated code, and as a result a small decrease in speed is observed when offloading to the GPU.
 

png
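As a consistency check (not part of the approved caption), the quoted throughput gain is roughly what Amdahl's law predicts for a fraction f = 0.30 of the event time accelerated by a factor k = 5:

\[ S = \frac{1}{(1-f) + f/k} = \frac{1}{0.70 + 0.06} \approx 1.32 \]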
Line: 30 to 30
 

Event throughput rates with and without GPU acceleration as a function of the number of ATLAS trigger (Athena) processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to either perform the work on the CPU or offload it to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. A significant rate increase is seen when the ID track seeding is offloaded to the GPU. The ID track seeding takes about 30% of the event processing time on the CPU and is accelerated by about a factor of 5 on the GPU. A small rate decrease is observed when the calorimeter clustering is offloaded to the GPU. The calorimeter clustering takes about 8% of the event processing time on the CPU and is accelerated by about a factor of 2 on the GPU; however, the effect of the acceleration is offset by a small increase in the time of the non-accelerated code. There is only a relatively small increase in rate when the number of Athena processes is increased above the number of physical cores (28).
 

png
Line: 40 to 40
 

The time-averaged mean number of ATLAS trigger (Athena) processes in a wait state pending the return of work offloaded to the GPU, as a function of the number of Athena processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to offload work to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. When offloaded to the GPU, the ID track seeding takes about 8% of the total event processing time, and so the average number of Athena processes waiting is less than 1 for up to about 12 Athena processes. The offloaded calorimeter clustering takes about 4% of the event processing time, and so the average number of Athena processes waiting is less than 1 for up to about 25 Athena processes.
 

png
Line: 50 to 50
 

Added:
>
>
Breakdown of the time per event for Inner Detector Track Seeding offloaded to a GPU, showing the time fraction for the kernels running on the GPU (GPU execution) and the overhead associated with offloading the work (other). Track Seeding consists of the formation of triplets of hits compatible with a track. The overhead comprises the time to convert data structures between CPU and GPU data formats, the data-transfer time between CPU and GPU, and the Inter Process Communication (IPC) time that accounts for the transfer of data between the ATLAS trigger (Athena) processes and the process handling communication with the GPU. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. Measurements were made using one GPU with 12 Athena processes running on the CPU. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1.
 

png
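The split between "GPU execution" and "other" in breakdowns like this one can be obtained by bracketing the kernel launches with CUDA events and attributing the rest of the offload wall time to conversion, transfer and IPC. A generic sketch under that assumption (hypothetical names; not the ATLAS measurement code):

<verbatim>
// Hypothetical timing harness: kernel time from CUDA events, total offload
// time from a host clock; the difference approximates the "other" overhead.
#include <chrono>
#include <cstdio>

__global__ void seedKernel() { /* stand-in for the offloaded work */ }

void timeOffload()
{
    cudaEvent_t beg, end;
    cudaEventCreate(&beg);
    cudaEventCreate(&end);

    auto t0 = std::chrono::steady_clock::now();
    // ... data conversion and host-to-device copies would go here ...
    cudaEventRecord(beg);
    seedKernel<<<128, 256>>>();
    cudaEventRecord(end);
    cudaEventSynchronize(end);
    // ... device-to-host copies and back-conversion would go here ...
    auto t1 = std::chrono::steady_clock::now();

    float kernelMs = 0.f;
    cudaEventElapsedTime(&kernelMs, beg, end);
    float totalMs = std::chrono::duration<float, std::milli>(t1 - t0).count();
    std::printf("GPU execution: %.2f ms, other: %.2f ms\n",
                kernelMs, totalMs - kernelMs);
}
</verbatim>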
Line: 57 to 58
 

Added:
>
>
Breakdown of the time per event for Inner Detector Tracking offloaded to a GPU, showing the time fraction for the Counting, Doublet Making and Triplet Making kernels running on the GPU (GPU execution) and the overhead associated with offloading the work (other). The Counting kernel determines the number of pairs of Inner Detector hits, and the Doublet and Triplet Making kernels form combinations of 2 and 3 hits, respectively, compatible with a track. The overhead comprises the time to convert data structures between CPU and GPU data formats, the data-transfer time between CPU and GPU, and the Inter Process Communication (IPC) time that accounts for the transfer of data between the ATLAS trigger (Athena) processes and the process handling communication with the GPU. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. Measurements were made using one GPU with 12 Athena processes running on the CPU. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1.
 

png
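The Counting kernel described above sizes the output before the Doublet and Triplet Making kernels fill it — a standard GPU pattern (count, prefix-sum, fill). A generic sketch of that pattern, with a placeholder compatibility predicate rather than the actual detector geometry cuts:

<verbatim>
// Generic count-then-fill pattern: pass 1 counts pairs per element, a
// prefix sum turns counts into write offsets, pass 2 writes the doublets.
// compatible() is a stand-in for the real hit-pairing criteria.
__device__ bool compatible(int i, int j) { return ((i + j) & 1) == 0; }  // placeholder

__global__ void countDoublets(int n, int* counts)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int c = 0;
    for (int j = i + 1; j < n; ++j)
        if (compatible(i, j)) ++c;
    counts[i] = c;            // output size known before any allocation
}

__global__ void fillDoublets(int n, const int* offsets, int2* doublets)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int w = offsets[i];       // exclusive prefix sum of counts[]
    for (int j = i + 1; j < n; ++j)
        if (compatible(i, j)) doublets[w++] = make_int2(i, j);
}
</verbatim>

The prefix sum between the two passes can be done with, e.g., thrust::exclusive_scan over the counts array; triplet making repeats the same pattern over the stored doublets.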
Line: 64 to 66
 

Added:
>
>
Breakdown of the time per event for the ATLAS trigger process (Athena) running Inner Detector (ID) Track Seeding on the CPU or offloaded to a GPU, showing the time fraction for the Counting, Doublet Making and Triplet Making kernels running on the GPU (GPU execution) and the overhead associated with offloading the work (other). The Counting kernel determines the number of pairs of ID hits, and the Doublet and Triplet Making kernels form combinations of 2 and 3 hits, respectively, compatible with a track. The overhead comprises the time to convert data structures between CPU and GPU data formats, the data-transfer time between CPU and GPU, and the Inter Process Communication (IPC) time that accounts for the transfer of data between the Athena processes and the process handling communication with the GPU. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. Measurements were made with one GPU and 12 Athena processes running on the CPU. Athena was configured to run only ID tracking. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1.
 

png
Line: 71 to 74
 

Added:
>
>
Breakdown of the time per event for Calorimeter clustering offloaded to a GPU, showing the time fraction for the kernels running on the GPU (GPU execution) and the overhead associated with offloading the work (other). The overhead comprises the time to convert data structures between CPU and GPU data formats, the data-transfer time between CPU and GPU, and the Inter Process Communication (IPC) time that accounts for the transfer of data between the ATLAS trigger (Athena) processes and the process handling communication with the GPU. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. Measurements were made using one GPU with 14 Athena processes running on the CPU. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1.
 

png
Line: 78 to 82
 

Added:
>
>
Breakdown of the time per event for Calorimeter clustering offloaded to a GPU, showing the time fraction for the Classification, Tagging and Growing kernels running on the GPU (GPU execution) and the overhead associated with offloading the work (other). The Classification kernel identifies calorimeter cells that will initiate (seed), propagate (grow), or terminate a cluster; the Tagging kernel assigns a unique tag to seed cells; and the Growing kernel associates neighbouring growing or terminating cells to form clusters. The overhead comprises the time to convert data structures between CPU and GPU data formats, the data-transfer time between CPU and GPU, and the Inter Process Communication (IPC) time that accounts for the transfer of data between the ATLAS trigger (Athena) processes and the process handling communication with the GPU. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. Measurements were made using one GPU with 14 Athena processes running on the CPU. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1.
 

png
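The Classification/Tagging/Growing sequence described above behaves like a cellular automaton: seed cells receive unique tags, which then spread to neighbouring growing or terminating cells until nothing changes. A minimal sketch of one Growing sweep (the CSR-style neighbour list, tag ordering and names are assumptions, not the ATLAS kernel):

<verbatim>
// One sweep of the cluster-growing cellular automaton. The host launches
// this repeatedly (ping-ponging tagIn/tagOut) until *changed stays 0.
// nbrOffset has nCells+1 entries delimiting each cell's neighbour list.
enum CellType { TERMINATE = 0, GROW = 1, SEED = 2 };  // from Classification

__global__ void growStep(const int* nbrOffset, const int* nbrList,
                         const int* cellType, const unsigned* tagIn,
                         unsigned* tagOut, int nCells, int* changed)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= nCells) return;
    unsigned t = tagIn[c];
    if (cellType[c] != SEED) {                         // seeds keep their tag
        for (int k = nbrOffset[c]; k < nbrOffset[c + 1]; ++k) {
            unsigned nb = tagIn[nbrList[k]];
            if (nb > t) t = nb;   // adopt the highest neighbouring tag
        }
    }
    tagOut[c] = t;
    if (t != tagIn[c]) *changed = 1;   // benign race: any write flags a rerun
}
</verbatim>

Ordering the tags by seed significance would reproduce the "most energetic seed wins" behaviour noted in the ATL-COM-DAQ-2016-133 captions above.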
Line: 85 to 90
 

Added:
>
>
Breakdown of the time per event for the ATLAS trigger process (Athena) running Calorimeter clustering on the CPU and offloaded to a GPU, showing the time for the Classification, Tagging and Growing kernels running on the GPU (GPU execution) and the overhead associated with offloading the work (other). The Classification kernel identifies calorimeter cells that will initiate (seed), propagate (grow), or terminate a cluster; the Tagging kernel assigns a unique tag to seed cells; and the Growing kernel associates neighbouring growing or terminating cells to form clusters. The overhead comprises the time to convert data structures between CPU and GPU data formats, the data-transfer time between CPU and GPU, and the Inter Process Communication (IPC) time that accounts for the transfer of data between the Athena processes and the process handling communication with the GPU. There is a small increase in the execution time of the non-accelerated code when the calorimeter clustering is offloaded to the GPU. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. Measurements were made using one GPU with 14 Athena processes running on the CPU. Athena was configured to run only Calorimeter clustering. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1.
 

png pdf
Added:
>
>
<!-- Template for adding a new plot -->

<!-- Put Text Here -->

<!-- attach the files and then insert the filename name in the three places below -->


png pdf

 
Line: 126 to 149
 
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart3.pdf" path="IDexecutiontimePiChart3.pdf" size="17455" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart1.pdf" attr="" comment="" date="1474557266" name="CaloExecutionTimePiChart1.pdf" path="CaloExecutionTimePiChart1.pdf" size="14793" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart1.pdf" path="IDexecutiontimePiChart1.pdf" size="15535" user="baines" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.pdf" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart2.pdf" path="CaloExecutionTimePiChart2.pdf" size="16276" user="baines" version="1"
>
>
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.pdf" attr="" comment="" date="1474620821" name="CaloExecutionTimePiChart2.pdf" path="CaloExecutionTimePiChart2.pdf" size="16276" user="baines" version="2"
 
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart2.pdf" path="IDexecutiontimePiChart2.pdf" size="17020" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart3.pdf" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart3.pdf" path="CaloExecutionTimePiChart3.pdf" size="16899" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.png" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart3.png" path="IDexecutiontimePiChart3.png" size="157522" user="baines" version="1"
Changed:
<
<
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.png" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart2.png" path="CaloExecutionTimePiChart2.png" size="97875" user="baines" version="1"
>
>
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.png" attr="" comment="" date="1474620821" name="CaloExecutionTimePiChart2.png" path="CaloExecutionTimePiChart2.png" size="116686" user="baines" version="2"
 
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.png" attr="" comment="" date="1474557155" name="IDexecutiontimePiChart2.png" path="IDexecutiontimePiChart2.png" size="135808" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.png" attr="" comment="" date="1474557266" name="IDexecutiontimePiChart1.png" path="IDexecutiontimePiChart1.png" size="48314" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart1.png" attr="" comment="" date="1474557703" name="CaloExecutionTimePiChart1.png" path="CaloExecutionTimePiChart1.png" size="102486" user="baines" version="1"

Revision 3 (2016-09-22) - JohnTMBaines

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 38 to 38
 pdf
Deleted:
<
<

png eps pdf
 
The time-averaged mean number of ATLAS trigger (Athena) processes in a wait state pending the return of work offloaded to the GPU, as a function of the number of Athena processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to offload work to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. When offloaded to the GPU, the ID track seeding takes about 8% of the total event processing time, and so the average number of Athena processes waiting is less than 1 for up to about 12 Athena processes. The offloaded calorimeter clustering takes about 4% of the event processing time, and so the average number of Athena processes waiting is less than 1 for up to about 25 Athena processes.
Line: 59 to 51
 
Changed:
<
<

png eps pdf
>
>

png pdf
 
Deleted:
<
<
 
Changed:
<
<

png eps pdf
>
>

png pdf
 
Deleted:
<
<
 
Changed:
<
<

png eps pdf
>
>

png pdf
 
Added:
>
>

png pdf
 
Changed:
<
<

png eps pdf
>
>

png pdf
 
Added:
>
>

png pdf
 
Line: 124 to 122
 
META FILEATTACHMENT attachment="speedupG2.eps" attr="" comment="" date="1474550365" name="speedupG2.eps" path="speedupG2.eps" size="14054" user="baines" version="1"
META FILEATTACHMENT attachment="speedupG2.pdf" attr="" comment="" date="1474550365" name="speedupG2.pdf" path="speedupG2.pdf" size="21694" user="baines" version="1"
META FILEATTACHMENT attachment="speedupG2.png" attr="" comment="" date="1474550365" name="speedupG2.png" path="speedupG2.png" size="21361" user="baines" version="1"
Added:
>
>
META FILEATTACHMENT attachment="CaloExecutiontimePiChart3.png" attr="" comment="" date="1474557154" name="CaloExecutiontimePiChart3.png" path="CaloExecutiontimePiChart3.png" size="138650" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart3.pdf" path="IDexecutiontimePiChart3.pdf" size="17455" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart1.pdf" attr="" comment="" date="1474557266" name="CaloExecutionTimePiChart1.pdf" path="CaloExecutionTimePiChart1.pdf" size="14793" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart1.pdf" path="IDexecutiontimePiChart1.pdf" size="15535" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.pdf" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart2.pdf" path="CaloExecutionTimePiChart2.pdf" size="16276" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.pdf" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart2.pdf" path="IDexecutiontimePiChart2.pdf" size="17020" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart3.pdf" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart3.pdf" path="CaloExecutionTimePiChart3.pdf" size="16899" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart3.png" attr="" comment="" date="1474557154" name="IDexecutiontimePiChart3.png" path="IDexecutiontimePiChart3.png" size="157522" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart2.png" attr="" comment="" date="1474557154" name="CaloExecutionTimePiChart2.png" path="CaloExecutionTimePiChart2.png" size="97875" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart2.png" attr="" comment="" date="1474557155" name="IDexecutiontimePiChart2.png" path="IDexecutiontimePiChart2.png" size="135808" user="baines" version="1"
META FILEATTACHMENT attachment="IDexecutiontimePiChart1.png" attr="" comment="" date="1474557266" name="IDexecutiontimePiChart1.png" path="IDexecutiontimePiChart1.png" size="48314" user="baines" version="1"
META FILEATTACHMENT attachment="CaloExecutionTimePiChart1.png" attr="" comment="" date="1474557703" name="CaloExecutionTimePiChart1.png" path="CaloExecutionTimePiChart1.png" size="102486" user="baines" version="1"

Revision 2 (2016-09-22) - JohnTMBaines

Line: 1 to 1
 
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png
Line: 7 to 7
 
Deleted:
<
<
 

Introduction

Changed:
<
<
Approved plots that can be shown by ATLAS speakers at conferences and similar events. Please do not add figures on your own. Contact the responsible project leader in case of questions and/or suggestions.
>
>
Approved plots that can be shown by ATLAS speakers at conferences and similar events. Please do not add figures on your own. Contact the responsible project leader in case of questions and/or suggestions. Follow the guidelines on the trigger public results page.

Phase-I Upgrade public plots

ATL-COM-DAQ-2016-116 ATLAS Trigger GPU Demonstrator Performance Plots

The ratio of event throughput rates with GPU acceleration to the CPU-only rates as a function of the number of ATLAS trigger (Athena) processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to either perform the work on the CPU or offload it to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. The ID track seeding takes about 30% of the event processing time on the CPU and is accelerated by about a factor of 5 on the GPU. As a result, throughput increases by about 35% with GPU acceleration for up to 14 Athena processes. The Calorimeter clustering algorithm takes about 8% of the event processing time on the CPU and is accelerated by about a factor of 2 on the GPU; however, the effect of the acceleration is offset by a small increase in the time of the non-accelerated code, and as a result a small decrease in speed is observed when offloading to the GPU.
png eps pdf
Event throughput rates with and without GPU acceleration as a function of the number of ATLAS trigger (Athena) processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to either perform the work on the CPU or offload it to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. A significant rate increase is seen when the ID track seeding is offloaded to the GPU. The ID track seeding takes about 30% of the event processing time on the CPU and is accelerated by about a factor of 5 on the GPU. A small rate decrease is observed when the calorimeter clustering is offloaded to the GPU. The calorimeter clustering takes about 8% of the event processing time on the CPU and is accelerated by about a factor of 2 on the GPU; however, the effect of the acceleration is offset by a small increase in the time of the non-accelerated code. There is only a relatively small increase in rate when the number of Athena processes is increased above the number of physical cores (28).
png eps pdf

png eps pdf
The time-averaged mean number of ATLAS trigger (Athena) processes in a wait state pending the return of work offloaded to the GPU, as a function of the number of Athena processes running on the CPU. Separate tests were performed with Athena configured to execute only Inner Detector Tracking (ID), only Calorimeter topological clustering (Calo) or both (ID & Calo). The system was configured to offload work to one or two GPUs. The system consisted of two Intel(R) Xeon(R) E5-2695 v3 14-core CPUs with a clock speed of 2.30 GHz and two NVidia GK210GL GPUs in a Tesla K80 module. The input was a simulated 𝑡𝑡̅ dataset converted to a raw detector output format (bytestream). An average of 46 minimum-bias events per simulated collision were superimposed, corresponding to an instantaneous luminosity of 1.7×10^34 cm^-2 s^-1. When offloaded to the GPU, the ID track seeding takes about 8% of the total event processing time, and so the average number of Athena processes waiting is less than 1 for up to about 12 Athena processes. The offloaded calorimeter clustering takes about 4% of the event processing time, and so the average number of Athena processes waiting is less than 1 for up to about 25 Athena processes.
png eps pdf

png eps pdf

png eps pdf

png eps pdf

png eps pdf
 
Line: 31 to 114
 
<!-- Once this page has been reviewed, please add the name and the date e.g. StephenHaywood - 31 Oct 2006 -->
Added:
>
>
META FILEATTACHMENT attachment="occupancyG2.eps" attr="" comment="" date="1474550365" name="occupancyG2.eps" path="occupancyG2.eps" size="9710" user="baines" version="1"
META FILEATTACHMENT attachment="occupancyG2.pdf" attr="" comment="" date="1474550365" name="occupancyG2.pdf" path="occupancyG2.pdf" size="19395" user="baines" version="1"
META FILEATTACHMENT attachment="occupancyG2.png" attr="" comment="" date="1474550365" name="occupancyG2.png" path="occupancyG2.png" size="17249" user="baines" version="1"
META FILEATTACHMENT attachment="rateG2.eps" attr="" comment="" date="1474550365" name="rateG2.eps" path="rateG2.eps" size="12620" user="baines" version="1"
META FILEATTACHMENT attachment="rateG2.pdf" attr="" comment="" date="1474550365" name="rateG2.pdf" path="rateG2.pdf" size="22315" user="baines" version="1"
META FILEATTACHMENT attachment="rateG2.png" attr="" comment="" date="1474550365" name="rateG2.png" path="rateG2.png" size="21007" user="baines" version="1"
META FILEATTACHMENT attachment="speedupG2.eps" attr="" comment="" date="1474550365" name="speedupG2.eps" path="speedupG2.eps" size="14054" user="baines" version="1"
META FILEATTACHMENT attachment="speedupG2.pdf" attr="" comment="" date="1474550365" name="speedupG2.pdf" path="speedupG2.pdf" size="21694" user="baines" version="1"
META FILEATTACHMENT attachment="speedupG2.png" attr="" comment="" date="1474550365" name="speedupG2.png" path="speedupG2.png" size="21361" user="baines" version="1"

Revision 1 (2016-09-22) - AnnaSfyrla

Line: 1 to 1
Added:
>
>
META TOPICPARENT name="TriggerPublicResults"
AtlasPublicTopicHeader.png

Trigger Software Upgrade Public Results

Introduction

Approved plots that can be shown by ATLAS speakers at conferences and similar events. Please do not add figures on your own. Contact the responsible project leader in case of questions and/or suggestions.

<!-- ********************************************************* -->
<!-- Do NOT remove the remaining lines, but add requested info as appropriate-->
<!-- ********************************************************* -->


<!-- For significant updates to the topic, consider adding your 'signature' (beneath this editing box) -->

<!-- Person responsible for the page: 
Either leave as is - the creator's name will be inserted; 
Or replace the complete REVINFO tag (including percentages symbols) with a name in the form TwikiUsersName -->
Responsible: JohnBaines, TomaszBold
Subject: public
<!-- Once this page has been reviewed, please add the name and the date e.g. StephenHaywood - 31 Oct 2006 -->
 