Google open-sources Embedding Projector, a tool for visualizing high-dimensional data

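Data can be immensely valuable, but only when it is actually used. In other words, collecting it is only the beginning, and Google is a company that understands this well. Visualizations and dashboards are among the best ways to digest and present data. Today, Google is making a rather nifty data visualization tool an open source project: "Embedding Projector". Not everyone is a data scientist, so how you tell the story matters.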


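To enable a more intuitive exploration process, Google says it is open-sourcing the Embedding Projector.

The tool is a web application for interactive visualization and analysis of high-dimensional data. It was recently shown as an A.I. Experiment and is released as part of TensorFlow.

Google has also released a standalone version that lets users visualize their high-dimensional data without installing and running TensorFlow.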


With the Embedding Projector, you can view data in either a 2D or a 3D mode, zooming, rotating, and panning with natural click-and-drag gestures.

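Google's demonstration uses a TensorFlow model trained with the word2vec tutorial. Clicking any point in the view brings up a list of its nearest points and their distances, showing which words the algorithm has learned to be semantically related.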


Google makes 'Embedding Projector' an open source project

Data can be highly valuable, and no company knows that more than Google. It is constantly collecting a massive amount of it -- it is pretty much how the company butters its bread. Data only has value when it can be used, however, meaning it must ultimately tell a story. In other words, collecting it is only the beginning.

One of the best ways to digest and present data is with visualizations and dashboards. Not everyone is a data scientist, so how you tell a story matters. Today, Google is making a rather nifty data visualization tool an open source project. Called "Embedding Projector", it can show what the search giant calls "high-dimensional data".

"To enable a more intuitive exploration process, we are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data recently shown as an A.I. Experiment, as part of TensorFlow. We are also releasing a standalone version at, where users can visualize their high-dimensional data without the need to install and run TensorFlow", says Google.
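As a rough illustration of preparing data for the standalone version, the sketch below writes embeddings to tab-separated files. It assumes the tool loads one TSV file of vectors plus an optional TSV file of labels; that file format is not stated in this article, and the file names, the helper `to_tsv`, and the sample values are all made up for the example.

```python
import csv
import io


def to_tsv(rows):
    """Serialize rows (lists of values) as tab-separated text, one row per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample data: two words with 3-dimensional vectors.
words = ["important", "banana"]
vectors = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]

# Assumed layout: row i of the vectors file matches row i of the labels file.
vectors_tsv = to_tsv(vectors)
metadata_tsv = to_tsv([[word] for word in words])

# Hypothetical file names; any names would do.
with open("vectors.tsv", "w", newline="") as f:
    f.write(vectors_tsv)
with open("metadata.tsv", "w", newline="") as f:
    f.write(metadata_tsv)
```

Keeping vectors and labels in separate files with matching row order means the numeric data stays trivially parseable, while the labels can be edited independently.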

The search giant further shares, "with the Embedding Projector, you can navigate through views of data in either a 2D or a 3D mode, zooming, rotating, and panning using natural click-and-drag gestures. Below is a figure showing the nearest points to the embedding for the word “important” after training a TensorFlow model using the word2vec tutorial. Clicking on any point (which represents the learned embedding for a given word) in this visualization, brings up a list of nearest points and distances, which shows which words the algorithm has learned to be semantically related. This type of interaction represents an important way in which one can explore how an algorithm is performing".
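The "list of nearest points and distances" described above can be sketched in a few lines. This is a minimal illustration, not the Embedding Projector's own code: it assumes cosine distance as the metric, and the function names and the tiny 3-dimensional vectors are invented for the example (real word2vec embeddings have far more dimensions).

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means the same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_points(embeddings, query, k=3):
    """Return the k labels closest to `query` as (label, distance), nearest first."""
    target = embeddings[query]
    scored = [
        (label, cosine_distance(target, vec))
        for label, vec in embeddings.items()
        if label != query  # skip the query word itself
    ]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Made-up vectors, for illustration only.
toy_embeddings = {
    "important": [0.9, 0.1, 0.0],
    "significant": [0.8, 0.2, 0.1],
    "crucial": [0.7, 0.3, 0.0],
    "banana": [0.0, 0.1, 0.9],
}

for label, dist in nearest_points(toy_embeddings, "important", k=2):
    print(f"{label}\t{dist:.3f}")
```

If the learned embeddings are good, semantically related words end up with small distances to the query, which is what makes this kind of neighbor list a useful check on how an algorithm is performing.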
