It's not every day you see AI and machine learning taking the spotlight in the style of rock stars, but here we are:
- WIRED's reportage on Databricks' DBRX
- Demis Hassabis receiving a knighthood
The AI world is buzzing with celebrity-like fervor: https://www.turingpost.com/p/fod47
Meta released the Video Joint Embedding Predictive Architecture (V-JEPA). This model is a leap toward Yann LeCun's vision of advanced machine intelligence.
▪️ V-JEPA mirrors how humans naturally learn about their environment: largely through observational experience.
V-JEPA excels in:
- Identifying complex object interactions
- Interpreting those interactions
▪️ V-JEPA is a non-generative model that enhances training and sample efficiency.
It predicts video content in an abstract representation space, distinct from pixel-level predictions. This method allows for a significant reduction in the need for labeled data.
▪️ The model's learning process involves:
- Masking significant portions of a video to challenge its predictive capabilities
- Fostering a deeper "understanding" of spatial and temporal dynamics
This technique ensures V-JEPA develops a grasp of various interactions within a scene.
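Here is a minimal, hypothetical sketch of that training signal (illustrative only, not Meta's actual code): an encoder embeds the visible patches, a predictor guesses the embeddings of the masked patches, and the loss is computed in representation space rather than on pixels.

```python
import torch
import torch.nn as nn

NUM_PATCHES, DIM = 16, 64

# Stand-ins for V-JEPA's encoder and predictor (tiny MLPs for illustration).
encoder = nn.Sequential(nn.Linear(DIM, DIM), nn.GELU(), nn.Linear(DIM, DIM))
predictor = nn.Sequential(nn.Linear(DIM, DIM), nn.GELU(), nn.Linear(DIM, DIM))

video_patches = torch.randn(1, NUM_PATCHES, DIM)   # patchified video clip
mask = torch.zeros(NUM_PATCHES, dtype=torch.bool)
mask[:12] = True                                   # hide 75% of the clip

# Targets are embeddings of the masked patches (in V-JEPA these come from
# a slowly-updated target encoder; a stop-gradient stands in for that here).
with torch.no_grad():
    targets = encoder(video_patches)[:, mask]

context = encoder(video_patches[:, ~mask])         # encode visible patches only
predicted = predictor(context.mean(dim=1, keepdim=True))  # crude pooled guess

# The loss lives in representation space -- no pixels are reconstructed.
loss = (predicted - targets).abs().mean()
loss.backward()
```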
▪️ V-JEPA's design emphasizes:
- Abstract representation
- Prioritizing conceptual understanding over minute details
This approach is particularly effective in "frozen evaluations," where the pre-trained model adapts to new tasks with minimal adjustments.
▪️ V-JEPA is released under a Creative Commons NonCommercial license.
Paper: https://ai.meta.com/research/publications/revisiting-feature-prediction-for-learning-visual-representations-from-video/
Code: https://github.com/facebookresearch/jepa
Groq’s Language Processing Unit (LPU) promises significant speed advancements for deploying LLMs. Is it a rival to GPUs? How does it work? 👇🏼
▪️ The LPU is a special kind of processor designed to handle language tasks very quickly.
Unlike other computer chips that do many things at once, the LPU works on tasks one after the other, which is perfect for understanding and generating language.
▪️ The LPU is designed to overcome the two LLM bottlenecks: compute density and memory bandwidth.
Groq took a novel approach right from the start, focusing on software and compiler development before even thinking about the hardware. They made sure the software could guide how the chips talk to each other, ensuring they work together seamlessly. This makes the LPU good at processing language efficiently and at high speed.
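To see why memory bandwidth matters so much, here's a back-of-envelope calculation (all numbers are illustrative assumptions, not Groq's or any vendor's actual specs): each autoregressively generated token has to stream roughly every model weight through the chip once, so bandwidth caps tokens per second.

```python
# Back-of-envelope: why memory bandwidth caps autoregressive decoding speed.
# All numbers below are illustrative assumptions, not measured specs.

params = 70e9            # a 70B-parameter model
bytes_per_param = 2      # fp16/bf16 weights
model_bytes = params * bytes_per_param

hbm_bandwidth = 2e12     # ~2 TB/s, typical of a modern GPU's HBM
sram_bandwidth = 80e12   # on-chip SRAM is dramatically faster (assumed figure)

# Each decoded token streams every weight through the compute units once.
for name, bw in [("HBM-bound GPU", hbm_bandwidth), ("SRAM-bound LPU", sram_bandwidth)]:
    tokens_per_sec = bw / model_bytes
    print(f"{name}: ~{tokens_per_sec:.0f} tokens/sec upper bound")
```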
▪️ This advancement resulted in a highly optimized system that outperformed traditional setups in speed, cost efficiency, and energy consumption.
It's significant for industries such as finance, government, and technology, where rapid and precise data processing is essential.
▪️ Now, don't go tossing out your GPUs just yet!
While the LPU is a beast when it comes to inference, making light work of applying trained models to new data, GPUs still reign supreme in the training arena.
▪️ The LPU and GPU might become the dynamic duo of AI hardware, each excelling in their respective roles.
To better understand the architecture, Groq offers two papers:
1) https://wow.groq.com/groq-isca-paper-2020
2) https://wow.groq.com/isca-2022-paper/
Learn more about Groq at https://www.turingpost.com/p/fod41
Andrew Ng gave a talk on the opportunities in AI at Stanford University.
I summarised the main insights from this talk:
🔸 AI's Potential
Dr. Andrew Ng sees AI as being as versatile as electricity, capable of transforming various sectors.
He highlights supervised learning and generative AI as fundamental tools shaping the AI landscape.
🔸 Rise of Large Language Models
Ng highlights the groundbreaking capabilities of LLMs, illustrating how they facilitate rapid application development. He predicts an influx of custom AI applications driven by advancements in prompt-based AI.
🔸 Financial value and opportunities in AI
Ng sees opportunities for both nascent startups and entrenched corporations to capitalize on AI, despite the challenge of navigating short-lived trends and the imperative for tangible use cases.
🔸 Expanding AI across industries
Ng argues that AI could revolutionize various industries through tailored AI systems and data-centric approaches.
He addresses the hurdles in broadening AI adoption, highlighting the significance of low/no-code tools for customization.
🔸 The role of AI in technology and applications
AI has a significant impact on various layers of technology.
Professor Ng provides insights on how to harness the power of AI in applications, such as relationship coaching, highlighting the immense untapped market potential.
🔸 Framework for building AI startups
Ng shares his methodology, which highlights iterative development, early leadership involvement, and customer engagement as key factors in incubating successful AI startups.
🔸 Ethical considerations
Ng emphasizes the significance of pursuing concrete AI initiatives that adhere to ethical standards. He advocates for responsible innovation and support for those affected by the disruptive effects of AI.
🔸 AI risks
Addressing concerns surrounding AI, Ng discusses the distant reality of AGI and dismisses unfounded extinction risks. He advocates instead for a proactive focus on AI's concrete, near-term risks.
CoDeF is a new open-source video representation that can:
▪️ lift image-to-image translation to video-to-video translation
▪️ lift keypoint detection to keypoint tracking without any training
Code: https://github.com/qiuyu96/CoDeF
Team: Hao Ouyang, Qiuyu Wang, Yuxi Xiao, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou, Qifeng Chen, Yujun Shen
HKUST, Ant Group, CAD&CG, ZJU
The authors of Rift say that software will soon be written mostly by AI software engineers.
They present a server for your personal AI software engineer.
With it, you can perform conversational code edits, codebase-wide edits, and more!
Features of Rift:
▪️ Conversational code editing
▪️ Codebase-wide edits
▪️ Contextual codebase generation
The code is open-source 👇🏼
Code: https://github.com/morph-labs/rift
Last week, Google AI introduced OptFormer, one of the first Transformer-based frameworks for hyperparameter tuning, learned from large-scale optimization data using flexible text-based representations.
Google shows that a single Transformer network can:
1. imitate highly complex behaviors from multiple algorithms over long horizons
2. predict objective values very accurately, in many cases surpassing Gaussian Processes
Read more here: https://ai.googleblog.com/2022/08/optformer-towards-universal.html
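A hypothetical sketch of the "flexible text-based representation" idea: past trials are serialized into a text prompt that a Transformer can consume to propose the next configuration. The format below is invented for illustration and is not OptFormer's actual serialization.

```python
# Hypothetical serialization of an optimization history into text,
# in the spirit of OptFormer's text-based trial representation.
trials = [
    {"learning_rate": 1e-3, "batch_size": 32, "accuracy": 0.874},
    {"learning_rate": 3e-4, "batch_size": 64, "accuracy": 0.901},
]

def serialize(trials):
    lines = []
    for i, t in enumerate(trials):
        params = ", ".join(f"{k}={v}" for k, v in t.items() if k != "accuracy")
        lines.append(f"trial {i}: {params} -> accuracy={t['accuracy']}")
    lines.append(f"trial {len(trials)}:")  # the model completes this line
    return "\n".join(lines)

prompt = serialize(trials)
# A trained Transformer would now autoregressively complete the prompt with
# the next suggested configuration and a predicted objective value.
print(prompt)
```

Part of the appeal of a text interface is that metadata, such as which algorithm produced the trials, can simply be included in the prompt, letting one network imitate many optimizers.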
Google AI proposed leveraging large language models to translate natural-language instructions into tasks for robots.
A novel approach, developed in partnership with Everyday Robots, enables a physical agent to follow high-level textual instructions for physically-grounded tasks.
This approach grounds the language model in tasks that are feasible within a specific real-world context.
To evaluate the method, Google placed robots in a real kitchen setting and gave them tasks expressed in natural language.
Results show that grounding the language model in the real world nearly halves errors over non-grounded baselines.
Google AI also releases a robot simulation setup where the research community can test this approach. https://github.com/google-research/google-research/tree/master/saycan
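The core grounding trick (this is the SayCan recipe described in the paper; the scores below are made up for illustration) is to rank each candidate skill by the product of two numbers: the LLM's estimate of how useful the skill is for the instruction, and a learned value function's estimate of how feasible it is in the robot's current state.

```python
# Minimal sketch of SayCan-style grounding: combine the LLM's usefulness
# score for each skill with a feasibility score from the robot's value
# function, then pick the best product. All scores below are made up.
instruction = "I spilled my drink, can you help?"

candidate_skills = {
    # skill:            (llm_score, feasibility_in_current_state)
    "find a sponge":     (0.45, 0.9),
    "go to the table":   (0.30, 0.8),
    "pick up the apple": (0.05, 0.9),
}

best = max(candidate_skills, key=lambda s: candidate_skills[s][0] * candidate_skills[s][1])
print(f"Instruction: {instruction}\nNext skill: {best}")
```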
Google AI introduces iterative co-tokenization, a new approach to video-text learning.
This approach can efficiently fuse spatial, temporal and language information for video question-answering.
It outperforms the previous state-of-the-art by large margins.
Simultaneously, the model reduces the required GFLOPs from 150-360 to only 67, producing a highly efficient video question answering model.
Read more: https://ai.googleblog.com/2022/08/efficient-video-text-learning-with.html
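As a rough intuition for where the savings come from (a generic sketch of cross-modal token fusion, not the paper's exact architecture): a small set of learned tokens attends over the combined video and text tokens, compressing hundreds of inputs into a handful of fused tokens, and the paper iterates this tokenization step.

```python
import torch
import torch.nn as nn

# Generic sketch of cross-modal token fusion (illustrative, not the
# paper's exact architecture): learned query tokens compress many
# video and text tokens into a few fused tokens.
DIM, N_FUSED = 64, 8
video_tokens = torch.randn(1, 512, DIM)   # many space-time patches
text_tokens = torch.randn(1, 16, DIM)     # question tokens

fused = nn.Parameter(torch.randn(1, N_FUSED, DIM))   # learned query tokens
attn = nn.MultiheadAttention(DIM, num_heads=4, batch_first=True)

context = torch.cat([video_tokens, text_tokens], dim=1)
out, _ = attn(query=fused, key=context, value=context)
print(out.shape)  # (1, 8, 64): 528 input tokens compressed to 8
```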
Google AI open-sourced Rax, a JAX-based library for supervised ranking algorithms.
It can be used in, but is not limited to, the following applications:
1. Search
2. Recommendation
3. Question Answering
4. Dialogue System
Rax provides state-of-the-art ranking losses, standard ranking metrics, and a set of function transformations.
All this functionality is provided with a well-documented and easy-to-use API.
Find Rax here: https://github.com/google/rax
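A minimal usage sketch in the spirit of the repo's examples (verify the function names against the current API before relying on them): losses and metrics are plain JAX functions over score and label arrays.

```python
import jax.numpy as jnp
import rax  # pip install rax

# Scores produced by a ranking model for three items, plus relevance labels.
scores = jnp.asarray([2.2, -1.3, 5.4])
labels = jnp.asarray([1.0, 0.0, 0.0])

loss = rax.pairwise_hinge_loss(scores, labels)  # a standard ranking loss
ndcg = rax.ndcg_metric(scores, labels)          # a standard ranking metric
print(loss, ndcg)
```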
DeepMind applied the MuZero algorithm to the challenge of video compression, collaborating with YouTube to optimize the open-source VP9 codec.
The researchers introduce a novel self-competition-based reward mechanism to solve the underlying constrained RL problem.
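A hypothetical sketch of what a self-competition reward can look like (the exact formulation is in DeepMind's paper): the agent earns reward for beating its own recent historical performance, which turns a constrained objective into a simple win/loss signal.

```python
# Hypothetical self-competition reward: reward the agent for beating its
# own running historical performance on the constrained objective
# (e.g. quality at a target bitrate).
class SelfCompetitionReward:
    def __init__(self, momentum=0.99):
        self.baseline = None
        self.momentum = momentum

    def __call__(self, episode_score):
        if self.baseline is None:
            self.baseline = episode_score
        reward = 1.0 if episode_score > self.baseline else -1.0
        # Track the historical baseline with an exponential moving average.
        self.baseline = self.momentum * self.baseline + (1 - self.momentum) * episode_score
        return reward
```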
2 important ML use cases from last week:
1. Uber unveiled some details about DeepETA, the model it uses to predict arrival times
2. The TensorFlow team provided details about improving the popular TF-GAN framework
AutoML-Zero can discover new ML algorithms without major restrictions in the search space.
It divides the ML algorithm into 3 functions:
1. Setup function
2. Learn function
3. Predict function
AutoML-Zero relies on basic mathematical operations as its building blocks, rather than hand-designed ML components.
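Here's an illustrative rendering of that decomposition (invented for this post, not from the paper's codebase); this particular three-function program happens to implement a linear model trained with SGD:

```python
import numpy as np

# Illustrative AutoML-Zero-style program: a candidate algorithm is three
# functions over shared memory, built only from primitive math operations.

def setup(memory):
    memory["w"] = np.zeros(4)          # initialize model state

def learn(memory, x, y, lr=0.1):
    pred = memory["w"] @ x             # primitive ops only: multiply, add
    err = y - pred
    memory["w"] = memory["w"] + lr * err * x

def predict(memory, x):
    return memory["w"] @ x

# An evolutionary search would mutate the instructions inside these three
# functions and keep the variants that generalize best.
memory = {}
setup(memory)
for _ in range(100):
    x = np.random.randn(4)
    learn(memory, x, y=x.sum())        # toy target: sum of the features
print(predict(memory, np.ones(4)))     # should approach 4.0
```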
Maybe soon AutoML-Zero will rediscover algorithms like gradient descent on its own?
Read more about AutoML-Zero here: https://thesequence.substack.com/p/thesequence-edge2-automl-automl-zero
TACTO is a fast, flexible, and open-source simulator for tactile sensors.
It is released by @facebookai as a simulator for DIGIT, a compact tactile sensor designed for robotic in-hand manipulation.
Find the code for the simulator: https://github.com/facebookresearch/tacto