cni0 is a Linux network bridge device; all veth devices connect to this bridge, so all Pods on the same node can communicate with each other, as explained in the Kubernetes Network Model and the hotel analogy above.
[root@az3-k8s-13 ~]# ip route | grep tunl0
10.122.17.64/26 via 10.122.127.128 dev tunl0    // this route is unreachable
[root@az3-k8s-13 ~]# ip route del 10.122.17.64/26 via 10.122.127.128 dev tunl0 ; ip route add 10.122.17.64/26 via 192.168.3.110 dev tunl0 proto bird onlink
[root@az3-k8s-13 ~]# ip route | grep tunl0
10.122.17.64/26 via 192.168.3.110 dev tunl0 proto bird onlink    // now it works
ip route del 192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.113
// also move the default route over to the 192.168.3.x side
ip route del default via 192.168.0.253 dev eth0; ip route add default via 192.168.3.253 dev eth1
After everything was finally working, the ip route table on node4 looked like this:
[root@az3-k8s-14 ~]# ip route
default via 192.168.3.253 dev eth1
10.122.17.64/26 via 192.168.3.110 dev tunl0 proto bird onlink
10.122.124.128/26 via 192.168.0.111 dev tunl0 proto bird onlink
10.122.127.128/26 via 192.168.3.112 dev tunl0 proto bird onlink
blackhole 10.122.157.128/26 proto bird
10.122.157.129 dev cali19f6ea143e3 scope link
10.122.157.130 dev cali09e016ead53 scope link
10.122.157.131 dev cali0ad3225816d scope link
10.122.157.132 dev cali55a5ff1a4aa scope link
10.122.157.133 dev cali01cf8687c65 scope link
10.122.157.134 dev cali65232d7ada6 scope link
10.122.173.128/26 via 192.168.3.114 dev tunl0 proto bird onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.3.0/24 dev eth1 proto kernel scope link src 192.168.3.113
The packet leaves *cni0* and is redirected to the *veth6* virtual interface;
The packet leaves the *root netns* through *veth6* and reaches the *Pod 6 netns* through the *eth6* interface;
Flannel is a simple and easy way to configure a layer 3 network fabric designed for Kubernetes.
How Flannel works
Flannel runs a small, single binary agent called flanneld on each host, which is responsible for allocating a subnet lease to each host out of a larger, preconfigured address space. Flannel uses either the Kubernetes API or etcd directly to store the network configuration, the allocated subnets, and any auxiliary data (such as the host's public IP). Packets are forwarded using one of several backend mechanisms including VXLAN and various cloud integrations.
ARP (Address Resolution Protocol) table is used by a Layer 3 device (router, switch, server, desktop) to store the IP address to MAC address entries for a specific network device.
The FDB (forwarding database) table is used by a Layer 2 device (switch/bridge) to store the MAC addresses that have been learned and which ports that MAC address was learned on. The MAC addresses are learned through transparent bridging on switches and dedicated bridges.
The “admin prohibited filter” seen in the tcpdump output means there is a firewall blocking a connection. It does it by sending back an ICMP packet meaning precisely that: the admin of that firewall doesn’t want those packets to get through. It could be a firewall at the destination site. It could be a firewall in between. It could be iptables on the Linux system.
In the problematic environment, the host's firewall was found to be logging errors:
12月 28 23:35:08 hygon253 firewalld[10493]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -X DOCKER-ISOLATION-STAGE-1' failed: iptables: No chain/target/match by that name.
12月 28 23:35:08 hygon253 firewalld[10493]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -t filter -F DOCKER-ISOLATION-STAGE-2' failed: iptables: No chain/target/match by that name.
This is probably because firewalld was running when docker was started.
Do you have firewalld enabled, and was it (re)started after docker was started? If so, then it’s likely that firewalld wiped docker’s IPTables rules. Restarting the docker daemon should re-create those rules.
5: enp125s0f3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 64:2c:ac:e9:78:3d brd ff:ff:ff:ff:ff:ff
    inet 1.1.1.198/25 brd 1.1.1.255 scope global dynamic noprefixroute enp125s0f3
       valid_lft 12463sec preferred_lft 12463sec
    inet6 fe80::859a:7861:378e:d6ac/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
6: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 0c:42:a1:4f:d1:e2 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global noprefixroute enp2s0f0
       valid_lft forever preferred_lft forever

#ip route
default via 1.1.1.254 dev enp125s0f3 proto dhcp metric 101
1.1.1.128/25 dev enp125s0f3 proto kernel scope link src 1.1.1.198 metric 101
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.19.0.0/24 dev cni0 proto kernel scope link src 172.19.0.1
172.19.2.0/24 via 172.19.2.0 dev flannel.1 onlink
172.19.3.0/24 via 172.19.3.0 dev flannel.1 onlink
192.168.0.0/24 dev enp2s0f0 proto kernel scope link src 192.168.0.1 metric 100
Solution: the address that actually takes effect is the one bound to flannel.1.
// For example, flannel picked the following public IP (the IP on the default route), which broke the flannel network; it should use the internal IP instead
#ip -details link show flannel.1
29: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 96:ad:e2:29:29:09 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    vxlan id 1 local 30.1.1.1 dev eno1 srcport 0 0 dstport 8472 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
// Afterwards you can see flannel.1 now uses the address of enp2s0f0 (192.168.0.1)
#ip -details link show flannel.1
40: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 92:5c:b2:af:37:62 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535
    vxlan id 1 local 192.168.0.1 dev enp2s0f0 srcport 0 0 dstport 8472 nolearning ttl auto ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
If you happen to have different interfaces to be matched, you can match it on a regex pattern. Let’s say the worker nodes could’ve enp0s8 or enp0s9 configured, then the flannel args would be — --iface-regex=[enp0s8|enp0s9]
// use ip netns to get a container's network info
1022  [2021-04-14 15:53:06] docker inspect -f '{{.State.Pid}}' ab4e471edf50    // get the container's process id
1023  [2021-04-14 15:53:30] ls /proc/79828/ns/net
1024  [2021-04-14 15:53:57] ln -sfT /proc/79828/ns/net /var/run/netns/ab4e471edf50    // link it so that ip netns list can see it

// view the container's IP from the host
1026  [2021-04-14 15:54:11] ip netns list
1028  [2021-04-14 15:55:19] ip netns exec ab4e471edf50 ifconfig

// debug the network with nsenter
Get the pause container's SandboxKey:
root@worker01:~# docker inspect k8s_POD_ubuntu-5846f86795-bcbqv_default_ea44489d-3dd4-11e8-bb37-02ecc586c8d5_0 | grep SandboxKey
    "SandboxKey": "/var/run/docker/netns/82ec9e32d486",
root@worker01:~#

Now, using nsenter you can see the container's information.
root@worker01:~# nsenter --net=/var/run/docker/netns/82ec9e32d486 ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
    link/ether 0a:58:0a:f4:01:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.244.1.2/24 scope global eth0
       valid_lft forever preferred_lft forever

Identify the peer_ifindex, and finally you can see the veth pair endpoint in the root namespace.
root@worker01:~# nsenter --net=/var/run/docker/netns/82ec9e32d486 ethtool -S eth0
NIC statistics:
     peer_ifindex: 7
root@worker01:~# ip -d link show | grep '7: veth'
7: veth5e43ca47@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
root@worker01:~#
(combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f7aa44bf81b27bf0ff6c02339df2d2743cf952c1519fead4c563892d2d41a979" network for pod "nginx-deployment-6c8c86b759-f8fb7": NetworkPlugin cni failed to set up pod "nginx-deployment-6c8c86b759-f8fb7_default" network: failed to set bridge addr: "cni0" already has an IP address different from 172.19.2.1/24
You can either delete the interfaces so they get re-created, or reassign cni0 the IP reported in the error message:
ifconfig cni0 172.19.2.1/24
or
ip link set cni0 down && ip link set flannel.1 down
ip link delete cni0 && ip link delete flannel.1
systemctl restart containerd && systemctl restart kubelet
1004  [2021-10-27 10:49:08] ip netns add ren
1005  [2021-10-27 10:49:12] ip netns show
1006  [2021-10-27 10:49:22] ip netns exec ren route            // empty
1007  [2021-10-27 10:49:29] ip netns exec ren iptables -L
1008  [2021-10-27 10:49:55] ip link add veth1 type veth peer name veth1_p    // both NICs are now visible on the host
1009  [2021-10-27 10:50:07] ip link set veth1 netns ren        // move veth1 from the host's default netns into ren; veth1 disappears from the host
1010  [2021-10-27 10:50:18] ip netns exec ren route
1011  [2021-10-27 10:50:25] ip netns exec ren iptables -L
1012  [2021-10-27 10:50:39] ifconfig
1013  [2021-10-27 10:50:51] ip link list
1014  [2021-10-27 10:51:29] ip netns exec ren ip link list
1017  [2021-10-27 10:53:27] ip netns exec ren ip addr add 172.19.0.100/24 dev veth1
1018  [2021-10-27 10:53:31] ip netns exec ren ip link list
1019  [2021-10-27 10:53:39] ip netns exec ren ifconfig
1020  [2021-10-27 10:53:42] ip netns exec ren ifconfig -a
1021  [2021-10-27 10:54:13] ip netns exec ren ip link set dev veth1 up
1022  [2021-10-27 10:54:16] ip netns exec ren ifconfig
1023  [2021-10-27 10:54:22] ping 172.19.0.100
1024  [2021-10-27 10:54:35] ifconfig -a
1025  [2021-10-27 10:55:03] ip netns exec ren ip addr add 172.19.0.101/24 dev veth1_p
1026  [2021-10-27 10:55:10] ip addr add 172.19.0.101/24 dev veth1_p
1027  [2021-10-27 10:55:16] ifconfig veth1_p
1028  [2021-10-27 10:55:30] ip link set dev veth1_p up
1029  [2021-10-27 10:55:32] ifconfig veth1_p
1030  [2021-10-27 10:55:38] ping 172.19.0.101
1031  [2021-10-27 10:55:43] ping 172.19.0.100
1032  [2021-10-27 10:55:53] ip link set dev veth1_p down
1033  [2021-10-27 10:55:54] ping 172.19.0.100
1034  [2021-10-27 10:55:58] ping 172.19.0.101
1035  [2021-10-27 10:56:08] ifconfig veth1_p
1036  [2021-10-27 10:56:32] ping 172.19.0.101
1037  [2021-10-27 10:57:04] ip netns exec ren route
1038  [2021-10-27 10:57:52] ip netns exec ren ping 172.19.0.101
1039  [2021-10-27 10:57:58] ip link set dev veth1_p up
1040  [2021-10-27 10:57:59] ip netns exec ren ping 172.19.0.101
1041  [2021-10-27 10:58:06] ip netns exec ren ping 172.19.0.100
1042  [2021-10-27 10:58:14] ip netns exec ren ifconfig
1043  [2021-10-27 10:58:19] ip netns exec ren route
1044  [2021-10-27 10:58:26] ip netns exec ren ping 172.19.0.100 -I veth1
1045  [2021-10-27 10:58:58] ifconfig veth1_p
1046  [2021-10-27 10:59:10] ping 172.19.0.100
1047  [2021-10-27 10:59:26] ip netns exec ren ping 172.19.0.101 -I veth1

Attach the NICs to the docker0 bridge:
1160  [2021-10-27 12:17:37] brctl show
1161  [2021-10-27 12:18:05] ip link set dev veth3_p master docker0
1162  [2021-10-27 12:18:09] ip link set dev veth1_p master docker0
1163  [2021-10-27 12:18:13] ip link set dev veth2 master docker0
1164  [2021-10-27 12:18:15] brctl show

brctl showmacs br0
brctl show cni0
brctl addif cni0 veth1 veth2 veth3    // attach several container peer NICs to the cni bridge
Linux has a default network namespace, and process 1 starts out in it. Every other process on Linux is forked from process 1, and unless something different is explicitly requested at clone time, all processes share this default network namespace.
All network devices are created in the host's default network namespace; a device can be moved to another namespace with ip link set <device> netns <namespace>. Sockets also belong to a network namespace: a socket lives in the netns of the process that created it.
//file: net/socket.c
int sock_create(int family, int type, int protocol, struct socket **res)
{
	return __sock_create(current->nsproxy->net_ns, family, type, protocol, res, 0);
}
78.692839007 seconds time elapsed

#dmidecode -t processor
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 3.2.0 present.
# SMBIOS implementations newer than version 3.0 are not
# fully supported by this version of dmidecode.
Handle 0x0004, DMI type 4, 48 bytes
Processor Information
	Socket Designation: BGA3576
	Type: Central Processor
	Family: <OUT OF SPEC>
	Manufacturer: PHYTIUM
	ID: 00 00 00 00 70 1F 66 22
	Version: S2500
	Voltage: 0.8 V
	External Clock: 50 MHz
	Max Speed: 2100 MHz
	Current Speed: 2100 MHz
	Status: Populated, Enabled
	Upgrade: Other
	L1 Cache Handle: 0x0005
	L2 Cache Handle: 0x0007
	L3 Cache Handle: 0x0008
	Serial Number: N/A
	Asset Tag: No Asset Tag
	Part Number: NULL
	Core Count: 64
	Core Enabled: 64
	Thread Count: 64
	Characteristics:
		64-bit capable
		Multi-Core
		Hardware Thread
		Execute Protection
		Enhanced Virtualization
		Power/Performance Control
#./mlc
Intel(R) Memory Latency Checker - v3.9
Measuring idle latencies (in ns)...
		Numa node
Numa node	     0
       0	 145.8
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        : 110598.7
3:1 Reads-Writes : 93408.5
2:1 Reads-Writes : 89249.5
1:1 Reads-Writes : 64137.3
Stream-triad like: 77310.4
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
		Numa node
Numa node	     0
       0	110598.4
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject	Latency	Bandwidth
Delay	(ns)	MB/sec
==========================
 00000	506.00	 111483.5
 00002	505.74	 112576.9
 00008	505.87	 112644.3
 00015	508.96	 112643.6
 00050	574.36	 112701.5
 00100	501.32	 112775.9
 00200	475.47	 112839.3
 00300	224.52	  91560.4
 00400	194.54	  70515.6
 00500	185.13	  57233.2
 00700	178.71	  41591.6
 01000	170.46	  29524.1
 01300	165.43	  22933.2
 01700	164.33	  17702.9
 02500	164.14	  12206.9
The mlc test results when both settings are on:
#./mlc
Intel(R) Memory Latency Checker - v3.9
Measuring idle latencies (in ns)...
		Numa node
Numa node	     0	     1
       0	  81.6	 145.9
       1	 144.9	  81.2
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        : 227204.2
3:1 Reads-Writes : 212432.5
2:1 Reads-Writes : 210423.3
1:1 Reads-Writes : 196677.2
Stream-triad like: 189691.4
// capture a subnet range
tcpdump -i bond0 port 3001 and net 1.2.3.0/24 and host not 1.2.3.211 -nn -X

// capture DNAT packets; 246 in the tcp options indicates DNAT
tcpdump -nn -vvv -i eth0 tcp dst port 3306 and '(tcp[tcpflags] & (tcp-syn) != 0) and (tcp[20] = 246)'

// on top of the above, capture a specific vip: 10.142.*.*
tcpdump -nn -vvv -i eth0 tcp dst port 3306 and '(tcp[tcpflags] & (tcp-syn) != 0) and tcp[20]=246 and tcp[24]=10 and tcp[25]=142'

// capture FNAT packets; 252 in the tcp options indicates FNAT
tcpdump -nn -vvv -i eth0 tcp dst port 3306 and '(tcp[tcpflags] & (tcp-ack) != 0) and (tcp[20] = 252)'

// capture by a given VPC IP, e.g. 172.16.x.x
tcpdump -nn -vvv -i eth0 tcp dst port 3306 and '(tcp[tcpflags] & (tcp-ack) != 0) and (tcp[32] = 172) and (tcp[33] = 16)'

// capture FNAT packets by client IP, e.g. 172.16.x.x
tcpdump -nn -vvv -i eth0 tcp dst port 3306 and '(tcp[tcpflags] & (tcp-ack) != 0) and (tcp[20]=252) and (tcp[24]=172) and (tcp[25]=16)'

Capture and save packets with tcpdump:
sudo tcpdump -i eth0 port 3306 -w plantegg.cap
For example, for an expression that adds two ints, the pre-vectorization implementation might look like this:
class ExpressionIntAdd extends Expression {
    Datum eval(Row input) {
        int left = input.getInt(leftIndex);
        int right = input.getInt(rightIndex);
        return new Datum(left + right);
    }
}
EXEC  : instructions per nominal CPU cycle
IPC   : instructions per CPU cycle
FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
L3MISS : L3 (read) cache misses
L3MPKI : L3 misses per kilo instructions
L3HIT  : L3 (read) cache hit ratio (0.00-1.00)
L2DMISS: L2 data cache misses
L2DHIT : L2 data cache hit ratio (0.00-1.00)
L2DMPKI: number of L2 data cache misses per kilo instruction
L2IMISS: L2 instruction cache misses
L2IHIT : L2 instruction cache hit ratio (0.00-1.00)
L2IMPKI: number of L2 instruction cache misses per kilo instruction
L2MPKI : number of both L2 instruction and data cache misses per kilo instruction
---------------------------------------------------------------------------------------------------------------
TOTAL  *     1.29   1.20   1.08   1.00     12 M    0.73   0.04     10 M    0.87   0.03   0.07     19 M   0.00     0.55    N/A
Instructions retired: 336 G ; Active cycles: 281 G ; Time (TSC): 2082 Mticks ; C0 (active,non-halted) core residency: 107.90 %
PHYSICAL CORE IPC : 2.39 => corresponds to 34.14 % utilization for cores in active state Instructions per nominal CPU cycle: 2.58 => corresponds to 36.84 % core utilization over time interval ---------------------------------------------------------------------------------------------------------------
 0    0     1.34   1.26   1.06   1.00    8901 K    0.72   3.15     15 M    0.68   5.43   8.58     71 M   4.00     0.60    N/A
 1    0     1.42   1.33   1.06   1.00    8491 K    0.73   2.83     14 M    0.68   4.67   7.50     71 M   4.00     0.60    N/A
 2    0     1.41   1.33   1.06   1.00    8206 K    0.74   2.75     12 M    0.72   4.25   7.00     71 M   4.00     0.60    N/A
 3    0     1.46   1.38   1.06   1.00    7464 K    0.75   2.40     11 M    0.68   3.81   6.21     71 M   4.00     0.60    N/A
 4    0     1.31   1.24   1.06   1.00    9118 K    0.71   3.28     15 M    0.69   5.61   8.88     70 M   4.00     0.61    N/A
 5    0     1.41   1.33   1.06   1.00    8700 K    0.74   2.92     13 M    0.69   4.66   7.57     70 M   4.00     0.61    N/A
 6    0     1.41   1.33   1.06   1.00    8094 K    0.74   2.79     12 M    0.70   4.40   7.18     70 M   4.00     0.61    N/A
 7    0     1.43   1.35   1.06   1.00    7873 K    0.74   2.68     12 M    0.71   4.13   6.81     70 M   4.00     0.61    N/A
 8    0     1.44   1.36   1.06   1.00    8544 K    0.73   2.79     14 M    0.67   4.87   7.66     20 M   1.00     0.61    N/A
 9    0     1.24   1.16   1.06   1.00     524 K    0.51   0.21     86 K    0.94   0.03   0.24     20 M   1.00     0.61    N/A
10    0     1.26   1.18   1.07   1.00     379 K    0.50   0.15     60 K    0.95   0.02   0.17     20 M   1.00     0.61    N/A
11    0     1.24   1.16   1.07   1.00     533 K    0.50   0.20     96 K    0.94   0.04   0.24     20 M   1.00     0.61    N/A
12    0     1.22   1.14   1.07   1.00    1180 K    0.34   0.47     98 K    0.94   0.04   0.51   3872 K   0.12     0.46    N/A
13    0     1.24   1.16   1.07   1.00     409 K    0.49   0.16     64 K    0.94   0.03   0.19   3872 K   0.12     0.46    N/A
---------------------------------------------------------------------------------------------------------------
SKT   0     1.18   1.11   1.06   1.00     113 M    0.67   0.73    139 M    0.71   0.89   1.62    186 M   1.12     0.59    N/A
SKT   1     1.23   1.14   1.08   1.00      33 M    0.53   0.21     11 M    0.89   0.07   0.28     38 M   0.12     0.45    N/A
---------------------------------------------------------------------------------------------------------------
TOTAL  *    1.21   1.13   1.07   1.00     147 M    0.65   0.46    150 M    0.74   0.47   0.93    224 M   0.62     0.57    N/A
Instructions retired: 319 G ; Active cycles: 283 G ; Time (TSC): 2108 Mticks ; C0 (active,non-halted) core residency: 107.12 %
PHYSICAL CORE IPC : 2.25 => corresponds to 32.18 % utilization for cores in active state Instructions per nominal CPU cycle: 2.41 => corresponds to 34.48 % core utilization over time interval ---------------------------------------------------------------------------------------------------------------
Cleaning up Zeroed PMU registers
Apple M1
The critically-acclaimed M1 processor delivers:
16 billion transistors on a 119 mm² die.
// compile: gcc -o simd -DCLS=$(getconf LEVEL1_DCACHE_LINESIZE) ./simd.c
int main (void)
{
    // ... initialize mul1 and mul2
    int i, i2, j, j2, k, k2;

    for (i = 0; i < N; ++i)
        for (j = 0; j < N; ++j)
            tmp[i][j] = mul2[j][i];                 // transpose first

    for (i = 0; i < N; ++i)
        for (j = 0; j < N; ++j)
            for (k = 0; k < N; ++k)
                res[i][j] += mul1[i][k] * tmp[j][k];  // after transposing, accesses are row-wise and memory friendly
Floating-point fused Multiply-Add (scalar). This instruction multiplies the values of the first two SIMD&FP source registers, adds the product to the value of the third SIMD&FP source register, and writes the result to the SIMD&FP destination register.
#define ONE     p = (char **)*p;
#define FIVE    ONE ONE ONE ONE ONE
#define TEN     FIVE FIVE
#define FIFTY   TEN TEN TEN TEN TEN
#define HUNDRED FIFTY FIFTY
static void usage()
{
    printf("Usage: ./mem-lat -b xxx -n xxx -s xxx\n");
    printf("  -b buffer size in KB\n");
    printf("  -n number of read\n\n");
    printf("  -s stride skipped before the next access\n\n");
    printf("Please don't use non-decimal based number\n");
}

int main(int argc, char* argv[])
{
    unsigned long i, j, size, tmp;
    unsigned long memsize = 0x800000;   /* 1/4 LLC size of skylake, 1/5 of broadwell */
    unsigned long count = 1048576;      /* memsize / 64 * 8 */
    unsigned int stride = 64;           /* skipped amount of memory before the next access */
    unsigned long sec, usec;
    struct timeval tv1, tv2;
    struct timezone tz;
    unsigned int *indices;

    while (argc-- > 0) {
        if ((*argv)[0] == '-') {        /* look at first char of next */
            switch ((*argv)[1]) {       /* look at second */
            case 'b':
                argv++;
                argc--;
                memsize = atoi(*argv) * 1024;
                break;
            case 'n':
                argv++;
                argc--;
                count = atoi(*argv);
                break;
            case 's':
                argv++;
                argc--;
                stride = atoi(*argv);
                break;

    // trick 2: fill mem with pointer references
    for (i = 0; i < size - 1; i++)
        *(char **)&mem[indices[i]*stride] = (char *)&mem[indices[i+1]*stride];
    *(char **)&mem[indices[size-1]*stride] = (char *)&mem[indices[0]*stride];
What finally put an end to magnetic-core memory was the integrated-circuit random-access memory device. In 1966, Dr. Robert H. Dennard at the IBM Thomas J. Watson Research Center developed the single-cell dynamic random-access memory, DRAM. Each DRAM cell consists of one switching transistor and one capacitor, and stores data as charge in the capacitor. Because the capacitor's charge leaks away, each cell must be refreshed every cycle to top the charge back up, hence the name dynamic random-access memory.
What we commonly call DDR4-2666 actually refers to the effective frequency: the real data-transfer rate after prefetching and amplification on the rising and falling clock edges. DDR4's prefetch is 8, which bank groups raise to 16 times the core frequency, so DDR4's lowest speed grade is 133.333 MHz * 16 = 2133 MHz. Because DDR (Double Data Rate) prefetches on both the rising and falling edge of each clock cycle, the clock frequency = effective frequency / 2.