Differences

This shows you the differences between two versions of the page.

programming:c-cpp-performance [2009/12/22 12:36]
cyril vectorialization
programming:c-cpp-performance [2013/09/19 16:40] (current)
Line 121: Line 121:
 ==== Other profilers ====

-sysprof and oprofile are other profiles which use a kernel module. Sysprof has even a GUI, and shows the time spent in all running functions and programs.
+sysprof and oprofile are other profilers which use a kernel module. Sysprof even has a GUI, and shows the time spent in all running functions and programs (system-wide). If you want details about process scheduling (which process was running when, and on which CPU), you can use trace-cmd and its front-end KernelShark.

 ==== Traps ====
Line 226: Line 226:
  
 Simpler names are available in GCC headers:
-  * SSE for float (<emmintrin.h>): _mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps, _mm_sqrt_ps, _mm_rsqrt_ps, _mm_rcp_ps, _mm_load_ps, _mm_store_ps, _mm_set1_ps, _mm_setr_ps
-  * SSE2 for int (<xmmintrin.h>): _mm_add_epi32, _mm_sub_epi32, _mm_set1_epi32, _mm_setr_epi32
+  * SSE for float (<xmmintrin.h>, operand type __m128): _mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps, _mm_sqrt_ps, _mm_rsqrt_ps, _mm_rcp_ps, _mm_load_ps, _mm_store_ps, _mm_set1_ps, _mm_setr_ps
+  * SSE2 for int (<emmintrin.h>, operand type __m128i): _mm_add_epi32, _mm_sub_epi32, _mm_set1_epi32, _mm_setr_epi32
+  * SSE2 int/float conversions: __builtin_ia32_cvtdq2ps, __builtin_ia32_cvtps2dq

 And you have to compile with GCC flags -msse and -msse2, or a -march value that supports them.
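As a rough illustration of the intrinsics listed above, here is a minimal sketch in C. The helper names scale_add_ps and add_epi32_to_float are hypothetical and not part of the wiki page. The sketch uses the unaligned load/store variants (_mm_loadu_ps, _mm_storeu_ps, _mm_loadu_si128) instead of _mm_load_ps/_mm_store_ps so that the arrays need no 16-byte alignment, and it uses _mm_cvtepi32_ps from <emmintrin.h> for the int-to-float conversion rather than the raw builtin named in the list.

```c
#include <xmmintrin.h>  /* SSE: __m128 and the _mm_*_ps intrinsics */
#include <emmintrin.h>  /* SSE2: __m128i and the _mm_*_epi32 intrinsics */

/* Hypothetical helper: out[i] = a[i] * scale + b[i], four floats per step.
   Assumes n is a multiple of 4. Unaligned loads and stores are used, so the
   arrays do not have to be 16-byte aligned. */
void scale_add_ps(const float *a, const float *b, float scale,
                  float *out, int n)
{
    __m128 vscale = _mm_set1_ps(scale);      /* broadcast scale to 4 lanes */
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vscale), vb));
    }
}

/* Hypothetical helper: adds two groups of four 32-bit ints (SSE2), then
   converts the four sums to float. */
void add_epi32_to_float(const int *a, const int *b, float *out)
{
    __m128i va  = _mm_loadu_si128((const __m128i *)a);
    __m128i vb  = _mm_loadu_si128((const __m128i *)b);
    __m128i sum = _mm_add_epi32(va, vb);
    _mm_storeu_ps(out, _mm_cvtepi32_ps(sum)); /* int -> float conversion */
}
```

On 32-bit x86 this would be built with something like gcc -O2 -msse -msse2; on x86-64, GCC normally enables both instruction sets by default.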
Line 239: Line 240:
 #include <sys/time.h>
 struct timeval tv;
-struct timezone tz;
-gettimeofday(&tv,&tz);
+gettimeofday(&tv,NULL);

 unsigned microseconds = tv.tv_sec*1000000 + tv.tv_usec; // beware overflows
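The "beware overflows" remark matters in practice: a 32-bit unsigned value can hold only about 4295 seconds' worth of microseconds, far less than the seconds-since-epoch value in tv_sec. One possible way around it, sketched below, is to do the arithmetic in 64 bits. The helper names timeval_to_us and now_us are hypothetical and not part of the wiki page.

```c
#include <stddef.h>    /* NULL */
#include <stdint.h>    /* uint64_t */
#include <sys/time.h>  /* struct timeval, gettimeofday */

/* Hypothetical helper: converts a timeval to microseconds using 64-bit
   arithmetic, so the multiplication cannot wrap around the way a 32-bit
   unsigned would. */
uint64_t timeval_to_us(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000u + (uint64_t)tv->tv_usec;
}

/* Hypothetical helper: current wall-clock time in microseconds. */
uint64_t now_us(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);   /* the timezone argument is not needed */
    return timeval_to_us(&tv);
}
```

To time a piece of code, call now_us() before and after it and subtract the two values. Note that gettimeofday reports wall-clock time, so the difference can be disturbed if the system clock is adjusted in between.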
programming/c-cpp-performance.1261485415.txt.gz · Last modified: 2013/09/19 16:43 (external edit)
CC Attribution-Share Alike 4.0 International