Currently, Hadoop exposes downstream clients to a variety of third-party libraries. As our code base grows and matures, we increase the set of libraries we rely on. At the same time, as our user base grows, we increase the likelihood that some downstream project will run into a conflict while attempting to use a different version of a library we depend on. While a few hot-button third-party libraries drive most of the development and support issues (e.g. Guava, Apache Commons, and Jackson), a coherent general practice will ensure that we avoid future complications. Simply attempting to coordinate library versions between Hadoop and the various downstream projects is untenable, because each project has its own release schedule and often tries to support multiple versions of other ecosystem projects. Furthermore, our current conservative approach to dependency updates leads to reliance on stale versions of everything. Those stale versions include
#!/bin/sh
# - ( ... ) runs the enclosed commands in a subshell
# - appending & starts the command as a background process
# - wait blocks until all background processes have finished
# - wait PID blocks until the given process exits, then returns that process's exit status
# - without a PID, wait waits for every child process, but its own exit status is always 0,
#   so the parent cannot detect a child returning non-zero, which makes it hard to kill
#   the parent when a child dies
# - wait reports the exit status of only a single specified process at a time
# (the body below is a guess at the truncated demo: a deliberately failing subshell)
(
  sleep 1
  exit 1
) &
pid=$!
wait "$pid"
echo "child exited with status $?"
One of the very good design decisions the Presto designers made is that Presto is loosely coupled from storage.
Presto is a distributed SQL execution engine; it does not manage table schemas or metadata itself, nor does it read data from storage itself. Those responsibilities are handled by plugins called connectors. Presto ships with a built-in Hive connector, which connects Hive's metastore and HDFS to Presto.
We can connect any storage system to Presto by writing a connector plugin.
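To make the decoupling concrete, here is a minimal sketch of querying Presto through its Python DB-API client; the coordinator address, user, and table name are assumptions for illustration. The catalog name is what routes the query through a particular connector, so pointing the same SQL at a different catalog reads from a different storage backend.

# Minimal sketch using the presto-python-client package (pip install presto-python-client).
# Host, port, user, and the table name are illustrative assumptions.
import prestodb

conn = prestodb.dbapi.connect(
    host='localhost',   # coordinator address (assumed)
    port=8080,
    user='demo',
    catalog='hive',     # catalog served by the built-in Hive connector
    schema='default',
)
cur = conn.cursor()
cur.execute('SELECT COUNT(*) FROM example_table')  # hypothetical table
print(cur.fetchall())

Swapping catalog='hive' for a catalog backed by another connector leaves the SQL untouched, which is exactly the loose coupling described above.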
from fabric.api import *
from fabric.contrib.files import *

env.user = 'your_user'
env.host_string = 'your_host'

def add_teamcity_user():
    # Create a dedicated system user and group for TeamCity, with /opt/teamcity as its home.
    # runcmd appears to be a helper wrapping fabric's run/sudo, defined later in the original file.
    runcmd('adduser --system --shell /bin/bash --gecos \'TeamCity Build Control\' --group --disabled-password --home /opt/teamcity teamcity')

def download_teamcity():
diff --git a/src/java/org/apache/hadoop/mapred/JobTrackerMetricsInst.java b/src/java/org/apache/hadoop/mapred/JobTrackerMetricsInst.java
index 74885a1..a041f28 100644
--- a/src/java/org/apache/hadoop/mapred/JobTrackerMetricsInst.java
+++ b/src/java/org/apache/hadoop/mapred/JobTrackerMetricsInst.java
@@ -121,8 +121,8 @@ class JobTrackerMetricsInst extends JobTrackerInstrumentation implements Updater
     metricsRecord.incrMetric("jobs_preparing", numJobsPreparing);
     metricsRecord.incrMetric("jobs_running", numJobsRunning);
-    metricsRecord.incrMetric("running_maps", numRunningMaps);
-    metricsRecord.incrMetric("running_reduces", numRunningReduces);
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Simple eventlet POC to check how to get ZMQ sockets working with
subprocesses spawned by a simple process."""
import os
import eventlet
import multiprocessing
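The gist is cut off after the imports. Below is a hedged sketch of how such a proof of concept might continue, assuming eventlet's green zmq wrapper (eventlet.green.zmq) and an arbitrary local TCP endpoint; the socket types, endpoint, and message are illustrative choices, not the original author's code.

# Hedged sketch only: eventlet.green.zmq and the endpoint/socket choices are assumptions.
import multiprocessing

import zmq                                    # plain blocking pyzmq for the child process
from eventlet.green import zmq as green_zmq   # cooperative (green) sockets for the parent

ENDPOINT = 'tcp://127.0.0.1:5556'             # arbitrary port chosen for this sketch

def child():
    # The spawned subprocess pushes one message over a regular ZMQ PUSH socket.
    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUSH)
    sock.connect(ENDPOINT)
    sock.send(b'hello from the subprocess')
    sock.close()
    ctx.term()

def main():
    # The parent binds a green PULL socket, so recv() cooperates with other greenlets.
    ctx = green_zmq.Context()
    sock = ctx.socket(green_zmq.PULL)
    sock.bind(ENDPOINT)
    proc = multiprocessing.Process(target=child)
    proc.start()
    print(sock.recv())
    proc.join()

if __name__ == '__main__':
    main()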