GopherChina 2017
- 1.1 Lessons from Using Go in Big Data Development (Qiniu Cloud)
- 1.2 Go in TiDB (PingCAP)
- 1.3 Go coding in go way (Neusoft)
- 1.4 interface.presented (@francesc)
- 1.5 NSQ: The Road to Rebuilding
- 1.6 Aliyun ApsaraDB Go Microservice Architecture (Alibaba Cloud)
- 1.7 Automate App Operation (@coreos)
- 1.8 Go Microservices in Practice (@Bilibili)
- 2.1 Building a Hundred-Million-Scale Real-Time Distributed Platform with Go (Grab)
- 2.2 Go in CardInfoLink's QR-Code Payment System (CardInfoLink)
- 2.3 Golang in a Million-Scale Search Service (360)
- 2.5 Go Service Governance in Cross-Border E-Commerce
- 2.6 ContainerOps DevOps Orchestration
- 2.7 Harbor: Implementing Remote Container Image Replication in the Open-Source Project
- 2.8 Go in a Securities Market Data System
- 2.9 Go in Practice in a Securities & Futures Market Data System
1.1 Lessons from Using Go in Big Data Development (Qiniu Cloud)
Pandora: a one-stop big data service platform
# A mature but complex big data ecosystem
- Data visualization
Zeppelin
HUE
Kibana
- Data search / analytics platforms
Apache Spark
Apache Hadoop
Hive
Elastic
- Cluster scheduling
Yarn
Apache Mesos
- Storage / message queues
Kafka
HDFS
- Data collection / pipelines
Logstash
Telegraf
Flume
- Monitoring
Prometheus
InfluxData
Grafana
# Pandora's philosophy
- Integrate the diverse big data tools
- Simplify complex big data management
- Build a complete, closed-loop big data lifecycle:
collect
transform
analyze
manage
consume
archive (cold storage)
# Keywords
『logs』『message queues』『compute tasks』『export tasks』『aggregation』『compression』『time-series database』『log search service』『object storage service』
『Log』『SDK』『WEB』『IoT』『Pipeline』『Transform』『Export』『TSDB』『LogDB』『Parquet』『ORC』『Text gzip』『Grafana』『Kibana』『Xspark offline analytics』『Spark Streaming』『MongoDB』『HTTP』
『real-time data increments』『export latency for massive data』『data transfer models: PULL vs. PUSH』『upstream/downstream throughput』『link overhead』『NIC』『memory』『network』
『decouple upstream from downstream』『decouple pulling from pushing』『data prefetching』『queue buffering』『pull and send in parallel』
『task splitting』『horizontal scaling』『task standardization: each task carries a fixed amount of traffic』
『resource utilization』『scheduling』『balancing』
『task management』『ops』『operations』『monitoring』
『buffer』『channel』『process』『thread』『goroutine』『schedule』『admin』『Golang』
『source』『transaction pool』『transaction put queue』『memory queue』『transaction send queue』『sink』『local file queue』『checkpoint sink』『restart workflow』『offset,check』『replay』『task statemachine』
『distributed consistency』『zookeeper/etcd』『eventual consistency: pull-based system + version stamps』
『balanced scheduling algorithm』『key hash』
『data duplication』『data loss』『writes: smoothness & spikes』『low latency』
# The protobuf serialization protocol
- Communicate with upstream systems via protobuf
- Don't re-parse data repeatedly; removes the CPU cost of parsing formats like JSON
# Variable wait time on failure
- If a write to the downstream fails, sleep 1s and retry; if it keeps failing, increase the sleep time, up to a cap of 10s
- Once a write succeeds, reset the failure sleep time back to 1s
- Effectively reduces pressure on the downstream (a minimal sketch follows)
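A minimal Go sketch of this variable wait. The slides don't specify the growth curve, so linear growth is assumed here; `writeDownstream` is a hypothetical stand-in that succeeds on the third try so the example terminates.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// attempts makes the hypothetical downstream fail twice, then succeed.
var attempts int

// writeDownstream is a stand-in for the real downstream write.
func writeDownstream(batch []byte) error {
	attempts++
	if attempts < 3 {
		return errors.New("downstream unavailable")
	}
	return nil
}

// sendWithBackoff retries a failed write, growing the sleep from 1s toward
// a 10s cap, and resetting it to 1s once a write succeeds.
func sendWithBackoff(batch []byte) {
	const (
		minWait = 1 * time.Second
		maxWait = 10 * time.Second
	)
	wait := minWait
	for {
		if err := writeDownstream(batch); err == nil {
			wait = minWait // success: reset the back-off to 1s
			return
		}
		fmt.Printf("write failed, retrying in %v\n", wait)
		time.Sleep(wait)
		if wait += time.Second; wait > maxWait {
			wait = maxWait // never wait longer than 10s
		}
	}
}

func main() {
	sendWithBackoff([]byte("example batch"))
	fmt.Println("write succeeded after", attempts, "attempts")
}
```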
1.2 Go in TiDB (PingCAP)
# What is TiDB
- Scalability
- High Availability
- SQL
- ACID
A distributed, consistent, scalable SQL database that supports the best features of both traditional RDBMS and NoSQL.
# Architecture
- The whole world:
  Applications -> Load Balancer (LVS, HAProxy, F5, ...) -> TiDB Servers -> TiKV Cluster
  TiDB Server <-> PD Server <-> TiKV Server
- SQL layer
- Protocol layer:
  Client - Packet -> Listener - Packet -> Connection Context - Command -> Protocol Decode
  -> SQL -> SQL Core Layer -> Data -> Protocol Encode - Data -> Connection Context - Packet -> Client
- SQL core layer:
  Protocol Layer - SQL -> Session Context - SQL -> Parser - AST -> Validator - AST -> Type Infer
  - AST -> Logical Optimizer - Logical Plan -> Physical Optimizer - Physical Plan -> Executor (Local && Distributed)
  -> TiKV -> Executor (Distributed) - Data -> Session Context - Data -> Protocol Encode - Data -> Connection Context - Packet -> Client
# Example - SQL
Schema:
CREATE TABLE t (c1 INT, c2 VARCHAR(32), INDEX idx1 (c1));
Query:
SELECT COUNT(c1) FROM t WHERE c1 > 10 AND c2 = "gopherchina";
- Logical plan:
  AST (SelectStmt node) -> Logical Plan:
  DataSource: from t
  -> Where: c1 > 10 and c2 = "gopherchina"
  -> Projection: count(c1)
- Physical plan:
  Logical Plan -> Physical Plan:
  IndexScan: idx1, range (10, +∞)
  -> Filter: c2 = "gopherchina"
  -> Aggregation: count(c1)
- Distributed physical plan:
  Physical Plan on TiKV: Read Index idx1: (10, +∞)
  - RowID -> Read Row Data by RowID - Row -> Filter: c2 = "gopherchina"
  - Row -> Partial Aggregate: count(c1)
  - count(c1) -> Physical Plan on TiDB: DistSQL Scan - count(c1) -> Final Aggregate: sum(count(c1))
# Challenges in Building a Distributed Database
- A very complex distributed system
- A lot of RPC work
- High performance
- Tons of data
- Huge amount of OLTP queries
- Very complex OLAP queries
- Why Go?
goroutine
channel
GC
concurrent
multi-core
# Parallel Data Scan Operator
- Executor - Index Ranges -> IndexScan Executor: split tasks by range -> Task Pool <- pick task - Worker Pool <-> TiKV
- IndexScan Executor - Row IDs -> TableScan Executor - Tasks -> Task Pool
- Worker Pool - Rows -> TableScan Executor - Rows -> Executor
(a sketch follows)
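Not TiDB's actual code: a minimal sketch of the pattern above, where index ranges are split into tasks, a fixed pool of workers picks tasks from a channel (standing in for the task pool), and rows stream back to the parent executor. `Task`, `Row`, and `scanRange` are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is one range of the index to scan; Row is a fetched row.
type Task struct{ start, end int }
type Row struct{ id int }

// scanRange stands in for the RPC that asks TiKV for one range.
func scanRange(t Task) []Row {
	rows := make([]Row, 0, t.end-t.start)
	for i := t.start; i < t.end; i++ {
		rows = append(rows, Row{id: i})
	}
	return rows
}

func main() {
	tasks := make(chan Task)
	rows := make(chan Row)

	// Worker pool: each worker picks tasks and streams rows back.
	var wg sync.WaitGroup
	for w := 0; w < 4; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range tasks {
				for _, r := range scanRange(t) {
					rows <- r
				}
			}
		}()
	}

	// Split the full range [0, 100) into fixed-size tasks.
	go func() {
		for start := 0; start < 100; start += 25 {
			tasks <- Task{start: start, end: start + 25}
		}
		close(tasks)
	}()

	// Close the row stream once all workers are done.
	go func() { wg.Wait(); close(rows) }()

	count := 0
	for range rows {
		count++
	}
	fmt.Println("rows scanned:", count)
}
```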
# Parallel HashJoin Operator
- TiKV => tables (Left && Right)
- Left Table -> Build HashTable - Hash Table
- Right Table -*> Join Workers (probe the left hash table) -> Joined Table
(a sketch follows)
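Likewise a sketch rather than TiDB's implementation: build a shared hash table from the left table, then let several join workers probe it with right-table rows in parallel. The row type and join key are assumptions. The hash table is read-only after the build phase, so probe workers can share it without locks.

```go
package main

import (
	"fmt"
	"sync"
)

// row is a hypothetical row joined on an integer key.
type row struct {
	key int
	val string
}

func main() {
	left := []row{{1, "l1"}, {2, "l2"}, {3, "l3"}}
	right := []row{{2, "r2"}, {3, "r3"}, {4, "r4"}}

	// Build phase: hash the left (build) side by join key.
	hash := make(map[int][]row)
	for _, r := range left {
		hash[r.key] = append(hash[r.key], r)
	}

	// Probe phase: several join workers consume right-side rows in parallel.
	in := make(chan row)
	out := make(chan [2]row)
	var wg sync.WaitGroup
	for w := 0; w < 3; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range in {
				for _, l := range hash[r.key] {
					out <- [2]row{l, r}
				}
			}
		}()
	}
	go func() {
		for _, r := range right {
			in <- r
		}
		close(in)
	}()
	go func() { wg.Wait(); close(out) }()

	for pair := range out {
		fmt.Printf("joined: %v + %v\n", pair[0], pair[1])
	}
}
```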
# Goroutine leaks: block profile
Timeout
Context (a sketch follows)
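A common shape of the fix hinted at here, sketched under assumptions: every potentially blocking operation is paired with a context timeout, and the result channel is buffered so the worker goroutine can exit even when nobody is left to read it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowQuery simulates work that may outlive the caller's patience.
func slowQuery(ctx context.Context) (string, error) {
	result := make(chan string, 1) // buffered: the worker can exit even if nobody reads

	go func() {
		time.Sleep(2 * time.Second) // pretend this is the slow part
		result <- "42"
	}()

	select {
	case r := <-result:
		return r, nil
	case <-ctx.Done(): // timeout or cancellation: don't block forever
		return "", ctx.Err()
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
	defer cancel()

	if _, err := slowQuery(ctx); errors.Is(err, context.DeadlineExceeded) {
		fmt.Println("query timed out; the worker goroutine still exits thanks to the buffered channel")
	}
}
```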
# Memory && GC
- Reduce the number of allocations
  - Get enough memory in one allocation operation
- Reuse objects
  - Introduce a cache in goyacc
  - Share a stack for all queries in one session
- sync.Pool
  - Thread safe
  - Reuse objects to relieve pressure on the GC (a sketch follows)
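A minimal sync.Pool sketch of the reuse idea; the pooled bytes.Buffer is an illustrative choice, not TiDB's actual pooled object.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers instead of allocating a fresh one
// per request, which relieves pressure on the garbage collector.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func handle(req string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer func() {
		buf.Reset() // must reset before reuse: Pool does not clear objects
		bufPool.Put(buf)
	}()

	buf.WriteString("response for ")
	buf.WriteString(req)
	return buf.String()
}

func main() {
	// sync.Pool is safe for concurrent use from many goroutines.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			fmt.Println(handle(fmt.Sprintf("query-%d", i)))
		}(i)
	}
	wg.Wait()
}
```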
- gogo/protobuf
  - Fast marshalling and unmarshalling
  - Fields without pointers cost less time in the garbage collector
- Monitor the memory usage
  - Monitoring the memory usage of the whole server is easy: runtime.MemProfile()
  - Monitoring the memory usage of a single session is hard
    - Account for large memory allocations
    - Account for memory-consuming operators
1.3 Go coding in go way (Neusoft)
- “Language influences/determines thought” - Sapir-Whorf hypothesis
- “A language that doesn’t affect the way you think about programming is not worth knowing.” - Alan J. Perlis
1.4 interface.presented @francesc
“interface{} says nothing” - Rob Pike in his Go Proverbs
“The bigger the interface, the weaker the abstraction” - Rob Pike in his Go Proverbs
// what function do you prefer?
// Cons:
// - how would you test it?
// - what if you want to write to memory?
// Pros:
// - ?
func WriteTo(f *os.File) error
// Write, Read, Close:
// Which one does WriteTo really need?
func WriteTo(w io.ReadWriteCloser) error
func WriteTo(w io.Writer) error // winner
// Cons:
// - how do you even write to interface{}?
// - probably requires runtime checks
// Pros:
// - you can write really bad code
func WriteTo(w interface{}) error
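A quick illustration (mine, not from the talk) of why the io.Writer version wins: the same function writes to a real file in production and to an in-memory buffer in tests.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// WriteTo needs only the ability to write, so it asks for exactly that.
func WriteTo(w io.Writer) error {
	_, err := io.WriteString(w, "hello, interfaces\n")
	return err
}

func main() {
	// Production: write to a real file (os.Stdout is an *os.File).
	if err := WriteTo(os.Stdout); err != nil {
		fmt.Println("write failed:", err)
	}

	// Test: write to memory, then inspect the result.
	var buf bytes.Buffer
	if err := WriteTo(&buf); err != nil {
		fmt.Println("write failed:", err)
	}
	fmt.Printf("captured %q\n", buf.String())
}
```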
“Be conservative in what you do, be liberal in what you accept from others” - Robustness Principle
“Be conservative in what you send, be liberal in what you accept” - Robustness Principle
“Return concrete types, receive interfaces as parameters, unless hiding implementation details” - Robustness Principle applied to Go (me)
// what function do you prefer?
func New() *os.File // winner
func New() io.ReadWriteCloser
func New() io.Writer
func New() interface{}
1.5 NSQ: The Road to Rebuilding
MQ
NSQ
Replication
HA
Auto-Balance
Delivery in Order
Tracing
Consume History Messages
Leader
Follower
Writer Buffer
Group Commit
Cursor
Offset
Channel
Optimize channel timeouts in Go (see the sketch after this list)
timer goroutine
timeout event chan
Worker Goroutine Pool
etcd
State Machine
Jepsen
distributed tracing
distributed testing
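The slides only name the pieces (a timer goroutine, a timeout-event channel, a worker goroutine pool), so the sketch below is a guess at the shape: a single goroutine owns all in-flight message deadlines and publishes expirations on one channel, rather than arming a timer per message.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type msgID int

// timeoutLoop is the single timer goroutine: it scans the in-flight
// deadlines on a coarse tick and emits expired IDs on one channel,
// instead of one timer (and its heap entry) per in-flight message.
func timeoutLoop(mu *sync.Mutex, deadlines map[msgID]time.Time, expired chan<- msgID) {
	tick := time.NewTicker(100 * time.Millisecond)
	defer tick.Stop()
	for now := range tick.C {
		mu.Lock()
		for id, d := range deadlines {
			if now.After(d) {
				delete(deadlines, id)
				expired <- id
			}
		}
		mu.Unlock()
	}
}

func main() {
	var mu sync.Mutex
	deadlines := map[msgID]time.Time{
		1: time.Now().Add(200 * time.Millisecond),
		2: time.Now().Add(400 * time.Millisecond),
	}
	expired := make(chan msgID, 16)
	go timeoutLoop(&mu, deadlines, expired)

	// Worker goroutine pool: consume timeout events and requeue messages.
	for w := 0; w < 2; w++ {
		go func() {
			for id := range expired {
				fmt.Println("message timed out, requeue:", id)
			}
		}()
	}

	time.Sleep(time.Second) // let both deadlines fire, then exit
}
```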
1.6 Aliyun ApsaraDB Go Microservice Architecture (Alibaba Cloud)
# Dubbo background
- Distributed RPC framework
- Plays nice with Java Spring applications (J2EE)
- Features:
  - Dynamic service registration & service discovery
  - SOA service governance
  - Soft load balancing
  - Circuit breaking, service degradation
service layering
service authorization
service containers
service orchestration
soft load balancing
service-level agreements
service capacity evaluation
service routing
service testing
service degradation
service registration & discovery
scheduling center
monitoring center
registry center
governance center
# Micro-services complexity
- Testing is still HARD!
- DevOps culture?
- Security?
- Distributed Tracing?
- Huge payload (Dubbo specific)
Java is “SO DYNAMIC”! Spring
IoC (DI)
AOP
# A deeper look at Java vs. Golang (Spring vs. Go tooling)
- Java
  - Not all Java applications are Spring applications (and not all Java developers are Spring developers)
  - Spring is BIG (Spring 2~4), with too much magic happening
  - Dubbo's IDL is a Java interface class
  - The JVM is a memory hog (0.5~6GB per micro-service JVM)
- Golang
  - Simple, elegant (e.g., defer vs. finally), and forces you to bundle 3rd-party source code
  - Go tooling:
    - go test / go test -bench & go tool <pprof/vet/cover/…>
    - go-torch (by Uber)
  - Memory (<=0.5GB per application container)
# Introduction to gRPC (https://grpc.io/)
- Open-sourced version of Google's internal “Stubby” RPC
- IDL for the service APIs
- “HTTP/2” & “bi-directional streaming”!
- Works with Protobuf 3
- Generates both client and server stubs in 9 languages officially (others are available via C-language bindings)
# Dubbo vs. Go kit
| | Dubbo & Spring | Go kit |
| --- | --- | --- |
| Service Discovery & LB | Dubbo Registry & Dubbo Subscriber | github.com/go-kit/kit/sd/(zk/consul/etcd/dnssrv/lb), google.golang.org/grpc/naming (lacks structured versioning) |
| “Structured” Logging | Log4j/Slf4j | github.com/go-kit/kit/log |
| Metrics | Spring Actuator (and many others) | github.com/go-kit/kit/metrics |
| Circuit Breaker | Dubbo/Netflix Hystrix | github.com/go-kit/kit/circuitbreaker |
| Transports | HTTP(JSON)/Dubbo/(gRPC) | github.com/go-kit/kit/transport/(grpc/http/httprp) |
| Caching layer | Dubbo/Spring Cache | - |
| Distributed Tracing | ELK/(天象 full-link tracing) | github.com/go-kit/kit/tracing (OpenTracing project) |
# Micro-services best practices
- Design with a “single” domain in mind (DB)
- Strong DevOps culture - CI/CD
- Logging, metrics and tracing
  - Logging options - Aliyun Logging Service / Apache Kafka / ELK
  - A trace ID to correlate all the requests that have been made
- Transactional requests with idempotence handling in mind / eventual consistency
- Think twice if you need to propagate your request to a number of micro-services in “parallel”
- Provide service governance and versioning
  - Circuit breaker / fallbacks
  - Multi-region cluster / failover
- Employ container/Docker technologies (DevOps)
  - Docker Compose
  - Swarm
  - k8s
- Be very careful when introducing a whole new framework/library (easy to shoot yourself in the foot)
- SIMPLE is the BEST
1.7 Automate App Operation @coreos
main.go

```go
package main

import (
	"log"
	"net/http"
)

func main() {
	// Serve static files from ./static on all interfaces, port 30080.
	fs := http.FileServer(http.Dir("static"))
	http.Handle("/", fs)

	log.Println("Listening on 0.0.0.0:30080")
	log.Fatal(http.ListenAndServe("0.0.0.0:30080", nil)) // exit if the server cannot start
}
```
Development - idea + code -> program
Package & release: docker build / docker push
-> Deployment - DNS, LB
# How to Deploy
- Database: PostgreSQL, MySQL, TiDB
- Coordination service: etcd, ZooKeeper
- Streaming: Kafka, Heron
- Big data: Spark, Hadoop
- Storage: Ceph, GlusterFS
- Logging: Elasticsearch
- Monitoring: Prometheus
# etcd Operator
Common Tasks
- Resize
- Upgrade
- Backup
- Failover
Advanced
- Restore
- TLS
- Monitoring/Alerting
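These common tasks are what the operator automates. Below is a toy reconcile loop, with entirely hypothetical types (not the real etcd-operator API), to show the core idea: observe the actual state, compare it with the declared desired state, and act.

```go
package main

import (
	"fmt"
	"time"
)

// ClusterSpec and ClusterStatus are hypothetical stand-ins for the real
// operator resource types.
type ClusterSpec struct{ Size int }      // desired state, declared by the user
type ClusterStatus struct{ Members int } // actual state, observed from the cluster

// observe stands in for querying the running cluster.
func observe() ClusterStatus { return ClusterStatus{Members: 2} }

// reconcile drives the actual state toward the desired state; this loop is
// what resize, upgrade, backup, and failover automation hang off of.
func reconcile(spec ClusterSpec, status ClusterStatus) {
	switch {
	case status.Members < spec.Size:
		fmt.Println("adding a member") // e.g. start a new etcd pod
	case status.Members > spec.Size:
		fmt.Println("removing a member")
	default:
		fmt.Println("cluster is in the desired state")
	}
}

func main() {
	spec := ClusterSpec{Size: 3}
	for i := 0; i < 3; i++ { // real operators loop forever on watch events
		reconcile(spec, observe())
		time.Sleep(time.Second)
	}
}
```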
# Deploy App Container
- Docker/OCI
- Standard app packaging format
- Kubernetes/Swarm
- Resource scheduling, cluster management
- Operator
- App specific operation automation
- Automation
- Declarative
- Version-controlled
- Cloud-native
- Customizable
- Composable
1.8 Go Microservices in Practice (@Bilibili)
Recommended reading: Sam Newman, *Building Microservices* (《微服务设计》)
# Microservice evolution: decomposing the monolith
- Map out business boundaries
- Deploy with resource isolation
- Isolate internal services from external-facing ones
- RPC framework
  - Serialization (gob)
  - Context management (timeout control)
  - Interceptors (auth, stats, rate limiting)
  - Service registration (ZooKeeper)
  - Load balancing (client-side)
- API Gateway
  - Unified & aggregated protocols
  - Parallel calls with errgroup (see the sketch after this list)
  - Per-business isolation
  - High availability via circuit breaking, degradation, rate limiting, etc.
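A minimal errgroup fan-out of the kind an API gateway uses to aggregate backend calls in parallel; the backend services named here are hypothetical. The first error cancels the shared context, so the remaining calls can abort early.

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/sync/errgroup"
)

// fetch is a hypothetical backend call.
func fetch(ctx context.Context, svc string) (string, error) {
	return "data from " + svc, nil
}

func main() {
	g, ctx := errgroup.WithContext(context.Background())

	var profile, feed string

	// Call the backends in parallel.
	g.Go(func() error {
		var err error
		profile, err = fetch(ctx, "profile-service")
		return err
	})
	g.Go(func() error {
		var err error
		feed, err = fetch(ctx, "feed-service")
		return err
	})

	// Wait for both; any error surfaces here.
	if err := g.Wait(); err != nil {
		fmt.Println("aggregation failed:", err)
		return
	}
	fmt.Println(profile, "|", feed)
}
```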
# High availability
- Isolation
  - by service
  - light vs. heavy traffic
  - physical
- Timeouts
  - connect
  - read
  - write
- Rate limiting
  - traffic
    - accept
    - connection
    - thread
  - resources
    - connection pool
    - thread pool
  - requests (see the sketch after this list)
    - total count
    - time window
    - smoothed rate limiting
  - distributed
    - redis + lua
    - nginx + lua
  - access layer
    - nginx limit_req
    - nginx limit_conn
- Degradation
  - per-link
  - automatic
  - manual
- Fault tolerance
  - retry-based
    - simple retry
    - primary/backup retry
    - success-rate-based retry
    - fail fast
  - circuit-breaker-based
    - dynamic eviction
    - recovery from failures
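One way (a sketch of mine, not Bilibili's code) to get time-window plus smoothed limiting is a token bucket such as golang.org/x/time/rate: tokens refill continuously, so traffic is shaped smoothly instead of being gated by hard per-second windows.

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	// Allow a sustained 100 requests/second with bursts of up to 10.
	limiter := rate.NewLimiter(rate.Limit(100), 10)

	allowed, dropped := 0, 0
	for i := 0; i < 1000; i++ {
		if limiter.Allow() {
			allowed++
		} else {
			dropped++ // over the limit: reject (or queue/degrade)
		}
		time.Sleep(time.Millisecond) // ~1000 req/s of offered load
	}
	fmt.Printf("allowed=%d dropped=%d\n", allowed, dropped)
}
```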
# Middleware
- databus (Kafka)
- canal (MySQL replication)
- bilitw (Twemproxy)
- bfs (Facebook Haystack, opencv)
- config-service
- dapper (Google Dapper)
# Continuous integration and delivery
- Version management (Semantic Versioning 2.0.0)
- Branch management (GitLab + MR review; “A successful Git branching model”)
- Environment management (integration environments)
- Testing (unit tests, service tests)
- Releases (smoke, canary, blue-green)
# Operations
- Service log collection, distribution, storage, and UI
- Distributed tracing
- Tooling: Zabbix, Dapper, ELK (Elasticsearch, Logstash, Kibana)
# References
- Sam Newman, *Building Microservices*; Chinese edition 《微服务设计》, translated by 崔力强 and 张骏
- http://semver.org/#semantic-versioning-200
- http://nvie.com/posts/a-successful-git-branching-model/
2.1 Building a Hundred-Million-Scale Real-Time Distributed Platform with Go (Grab)
Rails
Node.js
Amazon Web Services
Travis CI
GitHub
MySQL
GitHub
MySQL
Redis
Amazon Web Services
Phabricator
Jenkins
etcd
k8s
Docker
Kafka
Spark
Presto
Amazon Kinesis
Datadog
Scalyr
LightStep
Go
# Distributed Tracing
- Use cases
  - A request takes three seconds to complete: how do you diagnose where most of that time is spent?
  - How do you locate a single point of failure?
  - How do you detect and avoid circular dependencies?
  - How do you identify fan-in and fan-out?
- How it works
  - Generate a globally unique traceID at the API Gateway and inject it into the request headers
  - At every time-consuming node of the request, generate a spanID; measure time indexed by traceID+spanID and record other metadata
  - Automatically propagate the tracing information into every time-consuming operation
  - Finally, aggregate all the diagnostic information with the traceID as the key (a propagation sketch follows)
- context.Context
  func (s Server) Handler(ctx context.Context, req Request) error { // ... }
- OpenTracing http://opentracing.io
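A bare-bones sketch (mine) of carrying the traceID through context.Context, matching the Handler signature above; production code would use an OpenTracing tracer rather than a raw context value.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type traceKey struct{} // unexported key type avoids context key collisions

func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceKey{}, id)
}

func traceID(ctx context.Context) string {
	id, _ := ctx.Value(traceKey{}).(string)
	return id
}

// handler times one "span" of work and logs it under the request's trace ID.
func handler(ctx context.Context, req string) error {
	start := time.Now()
	defer func() {
		fmt.Printf("trace=%s span=handler req=%s took=%v\n",
			traceID(ctx), req, time.Since(start))
	}()
	time.Sleep(50 * time.Millisecond) // the time-consuming operation
	return nil
}

func main() {
	// The API gateway would generate this ID and inject it via headers;
	// here we seed the context directly.
	ctx := withTraceID(context.Background(), "trace-123")
	_ = handler(ctx, "GET /ride")
}
```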
2.2 Go in CardInfoLink's QR-Code Payment System (CardInfoLink)
(skipped…)
2.3 Golang in a Million-Scale Search Service (360)
C++
C++ -> C -> cgo -> Go (a bridge sketch follows)
Protocol Buffers
gdb
core dump
Simple, effective, good enough
connection pool
circuit breaking
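The slides give only the chain C++ -> C -> cgo -> Go. A toy illustration of that bridge, with the C shim inlined where a real project would expose the C++ engine behind an extern "C" wrapper:

```go
package main

/*
// In the real system this would be a thin C wrapper (extern "C")
// over the C++ search library; here it's a stand-in function.
int c_search(int query) { return query * 2; }
*/
import "C"

import "fmt"

func main() {
	// Go -> cgo -> C (-> C++ in the real system)
	result := int(C.c_search(21))
	fmt.Println("search result:", result)
}
```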