Is the number of NUMA nodes always equal to the number of sockets?



























I have used lscpu to check the configuration of two servers:



[root@localhost ~]# lscpu
Architecture: x86_64
......
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 26


The other:



[root@localhost Packages]# lscpu
Architecture: x86_64
.....
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45


So I am wondering whether the number of NUMA nodes is in fact always equal to the number of sockets. Is there any example where they are not equal?










cpu cpu-architecture numa

edited May 19 '15 at 6:14 by Mureinik
asked May 19 '15 at 6:03 by Nan Xiao






















2 Answers
































Why are you wondering about the number of NUMA nodes? The important part is the NUMA topology, which describes how those "nodes" are connected.



I have checked a few systems, including an 8-socket (10-core CPUs) system consisting of 4 interconnected 2-socket blades (Hitachi Compute Node 2000). Here, too, the number of NUMA nodes is equal to the number of CPU sockets (8). This depends on the CPU architecture, mainly its memory bus design.



NUMA (non-uniform memory access) defines how each logical CPU can access each part of memory. In a 2-socket system, each CPU (socket) has its own memory, which it can access directly. But it must also be able to access the memory attached to the other socket, and this of course takes more CPU cycles than accessing local memory. NUMA nodes specify which part of system memory is local to which CPU. You can have more layers of topology: for example, on an HP Superdome system (which uses Intel Itanium 2 CPUs), you have local CPU-socket memory, then memory on a different socket inside the same cell, and then memory in other cells (which has the highest latency).
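
On Linux you can inspect this CPU-to-memory mapping directly through sysfs. A minimal sketch, assuming your system has a node0 (substitute any node number that exists):

# list the CPUs that belong to NUMA node 0
cat /sys/devices/system/node/node0/cpulist

# show how much memory is local to node 0
grep -E 'MemTotal|MemFree' /sys/devices/system/node/node0/meminfo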



You can configure NUMA behavior on your system to give the best possible performance for your workload. For example, you can allow all CPUs to access all memory, or restrict them to local memory only, which changes how the Linux scheduler distributes processes among the available logical CPUs. If you have many processes that each need little memory, using only local memory can be beneficial, but if you have large processes (such as an Oracle database with its shared memory), using all memory across all CPUs might be better. A per-process sketch of both policies with numactl follows below.
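
For illustration, numactl can apply either policy when launching a process; the program names here are placeholders, not real binaries:

# spread a large process's memory evenly across all NUMA nodes
numactl --interleave=all ./large_memory_app

# pin a small process to node 0's CPUs and allocate only from node 0's memory
numactl --cpunodebind=0 --membind=0 ./small_app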



You can use commands such as numastat or numactl --hardware to check the NUMA status of your system. Here is the info from that 8-socket machine:



hana2:~ # lscpu
Architecture:          x86_64
CPU(s):                160
Thread(s) per core:    2
Core(s) per socket:    10
CPU socket(s):         8
NUMA node(s):          8
NUMA node0 CPU(s):     0-19
NUMA node1 CPU(s):     20-39
NUMA node2 CPU(s):     40-59
NUMA node3 CPU(s):     60-79
NUMA node4 CPU(s):     80-99
NUMA node5 CPU(s):     100-119
NUMA node6 CPU(s):     120-139
NUMA node7 CPU(s):     140-159

hana2:~ # numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
node 0 size: 130961 MB
node 0 free: 66647 MB
node 1 cpus: 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
node 1 size: 131072 MB
node 1 free: 38705 MB
node 2 cpus: 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
node 2 size: 131072 MB
node 2 free: 71668 MB
node 3 cpus: 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
node 3 size: 131072 MB
node 3 free: 47432 MB
node 4 cpus: 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
node 4 size: 131072 MB
node 4 free: 68458 MB
node 5 cpus: 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
node 5 size: 131072 MB
node 5 free: 62218 MB
node 6 cpus: 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139
node 6 size: 131072 MB
node 6 free: 68071 MB
node 7 cpus: 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159
node 7 size: 131008 MB
node 7 free: 47306 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  21  21  21  21  21  21  21
  1:  21  10  21  21  21  21  21  21
  2:  21  21  10  21  21  21  21  21
  3:  21  21  21  10  21  21  21  21
  4:  21  21  21  21  10  21  21  21
  5:  21  21  21  21  21  10  21  21
  6:  21  21  21  21  21  21  10  21
  7:  21  21  21  21  21  21  21  10


There you can see the amount of memory present in each NUMA node (CPU socket) and how much of it is free.
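
The numastat command mentioned above complements this with per-node allocation counters; a quick sketch (the PID is a placeholder):

# numa_hit = allocations served from the intended node,
# numa_miss/numa_foreign = allocations that spilled over to a different node
numastat

# per-process breakdown for a single PID
numastat -p <pid>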



The last section shows the NUMA topology: it lists the "distances" between individual nodes in terms of memory-access latency (the numbers are relative only; they don't represent time in milliseconds or anything else). Here you can see that the latency to local memory (node 0 accessing memory on node 0, node 1 on node 1, ...) is 10, while the remote latency (a node accessing memory on another node) is 21. Although this system consists of 4 individual blades, the latency is the same whether the other socket is on the same blade or on another blade.
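
The same distance table (typically taken from the firmware's ACPI SLIT) is also exposed through sysfs, so you can read it without numactl; a small sketch:

# distances from node 0 to every node; the first value is node 0's distance to itself
cat /sys/devices/system/node/node0/distance

# the full matrix, one row per node
for n in /sys/devices/system/node/node[0-9]*; do
    echo "$n: $(cat "$n/distance")"
done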



An interesting document about NUMA is also available on the Red Hat portal.






answered May 19 '15 at 8:41 by Marki555














No. The number of NUMA nodes does not always equal the number of sockets. For example, an AMD Threadripper 1950X has 1 socket and 2 NUMA nodes, while a dual Intel Xeon E5310 system can show 2 sockets and 1 NUMA node.
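
A quick way to check for such a mismatch on any machine is to compare the two relevant lscpu lines, or to ask numactl directly (a sketch; the exact label spacing varies between lscpu versions):

# compare the socket count with the NUMA node count
lscpu | grep -E 'Socket\(s\)|NUMA node\(s\)'

# numactl reports the number of available NUMA nodes on its first line
numactl --hardware | grep '^available:'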






answered Nov 8 '17 at 3:26 by user3135484