The Academic & Religious Impact of Wave Function of the Universe

Integration

Academic

Intro

Author

Kian Ghodoussi

Published

March 24, 2025

In the following notebook, we will be recreating Sturdy Statistics’ DeepDive page using the sturdy-stats-sdk.

In this notebook we will reproduce this deep dive analysis on every publication that cites the paper Wave Function of the Universe, authored by J. B. Hartle and Stephen Hawking and published in 1983. This paper spawned a series of downstream research fields and also had a sizeable impact on the relationship between religion and science, which we will begin to explore in this walkthrough.

Prerequisites

pip install sturdy-stats-sdk pandas numpy plotly

from IPython.display import display, Markdown, Latex
import pandas as pd
import numpy as np
import plotly.express as px
from sturdystats import Index, Job

from pprint import pprint

## Basic Utilities
px.defaults.template = "simple_white"  # Change the template
px.defaults.color_discrete_sequence = px.colors.qualitative.Dark24 # Change color sequence

def procFig(fig, **kwargs):
    fig.update_layout(plot_bgcolor= "rgba(0, 0, 0, 0)", paper_bgcolor= "rgba(0, 0, 0, 0)",
        margin=dict(l=0,r=0,b=0,t=30,pad=0),
        title_x=.5,
        **kwargs
    )
    fig.layout.xaxis.fixedrange = True
    fig.layout.yaxis.fixedrange = True
    return fig

def displayText(df, highlight):
    def processText(row):
        t = "\n".join([ f'1. {r["short_title"]}: {int(r["prevalence"]*100)}%' for r in row["paragraph_topics"][:5] ])
        x = row["text"]
        res = []
        for word in x.split(" "):
            for term in highlight:
                if term in word.lower() and "**" not in word:
                    word = "**"+word+"**"
            res.append(word)
        return f"<em>\n\n#### Result {row.name+1}/{df.index.max()+1}\n\n#### {row['published']}\n\n"+ t +"\n\n" + " ".join(res) + "</em>"

    res = df.apply(processText, axis=1).tolist()       
    display(Markdown(f"\n\n...\n\n".join(res)))
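
The highlighting step inside displayText can be tried on its own, without an index. The sketch below is a simplified standalone version; the name highlight_terms is hypothetical and not part of the SDK. It wraps any space-separated word that contains one of the search terms in Markdown bold markers, so trailing punctuation ends up inside the bold span.

```python
def highlight_terms(text, terms):
    """Wrap every space-separated word containing one of `terms` in ** markers."""
    out = []
    for word in text.split(" "):
        # Case-insensitive substring match; skip words that are already bold.
        if "**" not in word and any(t.lower() in word.lower() for t in terms):
            word = f"**{word}**"
        out.append(word)
    return " ".join(out)


sample = "As a physicist and a Christian theologian, I ask about God."
highlighted = highlight_terms(sample, ["god", "christian"])
print(highlighted)
```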

[Optional] Train Your Own Model

Sturdy Statistics provides an integration for citation data. Below we pass the paper's DOI to the cn_all integration to ingest the publications that cite it.

Training a model takes anywhere from 5 to 10 minutes. This step is optional: you can instead proceed with our public index for this analysis.

index = Index(id="index_d93b763e743a480a9f7e5f8f57c8c1b1")

# Uncomment the line below to create and train your own index
# index = Index(name="CN_Wave_Function_of_the_Universe") 

if index.get_status()["state"] == "untrained":
    index.ingestIntegration("cn_all", "https://doi.org/10.1103/PhysRevD.28.2960")
    job = index.train(dict(burn_in=1200, subdoc_hierarchy=False), fast=True, wait=False)
    print(job.get_status())
    # job.wait() # Sleeps until job finishes

Found an existing index with id="index_d93b763e743a480a9f7e5f8f57c8c1b1".

Exploration

In this section, we will demonstrate how to produce the two core visualizations in their simplest form: the sunburst and the time trend plot.

index = Index(id="index_d93b763e743a480a9f7e5f8f57c8c1b1")

Found an existing index with id="index_d93b763e743a480a9f7e5f8f57c8c1b1".

Sunburst

Our Bayesian probabilistic model learns a set of high-level topics from your corpus. These topics are completely custom to your data, whether your dataset has hundreds of documents or billions. The model then maps this set of learned topics to every single word, sentence, paragraph, document, and group of documents in your dataset, providing a powerful semantic index.

This indexing enables us to store data in a granular, structured tabular format, which in turn enables rapid analysis of complex questions.

Topic Query

df = index.topicSearch()
df.head(5)[["short_title", "topic_group_short_title", "topic_id", "mentions", "prevalence"]]

	short_title	topic_group_short_title	topic_id	mentions	prevalence
0	Quantum Gravity Theories	Quantum Gravity Concepts	19	356.0	0.060771
1	Quantum Constraints and Invariance	Theoretical Frameworks	16	250.0	0.044115
2	Cosmological Perturbations	Cosmological Observations and Phenomena	35	243.0	0.040926
3	Anthropic Inflationary Landscape	Cosmology	58	190.0	0.033849
4	Scalar Field Cosmology	Cosmology	87	230.0	0.031586

Visualization

We can see there are two names: short_title and topic_group_short_title. The topic group is a high-level thematic category, while a topic is a much more granular annotation.

A dataset can have hundreds of topics, but usually only 20-50 topic groups. This hierarchy is extremely useful for organizing and exploring data in hierarchical formats such as sunbursts.

The inner circle of the sunburst is the title of the plot. The middle layer is the topic groups. The leaf nodes are the topics that belong to the corresponding topic group. The size of each node is proportional to how often it shows up in the dataset.

df["title"] = "Wave Function of <br> the Universe Citations"
fig = px.sunburst(
    df,
    path=["title", "topic_group_short_title", "short_title"], 
    values="prevalence", hover_data={"topic_id": True}
)
procFig(fig, height=500).show()
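
To make the two-level hierarchy concrete, the sketch below sums topic prevalence within each topic group, using only the five rows printed in the Topic Query output. When a path is supplied, Plotly Express sizes a parent node from the values of its children, so this is roughly the arithmetic behind the middle ring. The totals here cover five topics only, not the full index.

```python
from collections import defaultdict

# (short_title, topic_group_short_title, prevalence) copied from the table above
rows = [
    ("Quantum Gravity Theories", "Quantum Gravity Concepts", 0.060771),
    ("Quantum Constraints and Invariance", "Theoretical Frameworks", 0.044115),
    ("Cosmological Perturbations", "Cosmological Observations and Phenomena", 0.040926),
    ("Anthropic Inflationary Landscape", "Cosmology", 0.033849),
    ("Scalar Field Cosmology", "Cosmology", 0.031586),
]

# Sum the prevalence of every topic that belongs to the same topic group.
group_totals = defaultdict(float)
for _topic, group, prevalence in rows:
    group_totals[group] += prevalence

# Print the groups from largest to smallest.
for group, total in sorted(group_totals.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {total:.6f}")
```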

Probabilistic Search

SEARCH_QUERY = "religion"

Document Search

When you submit a search query, our indexing model maps the query to its thematic contents. Because the index is a unified Bayesian probabilistic model, we can score each candidate excerpt within your index using a statistically meaningful metric, the Hellinger distance. Unlike cosine distance, whose values have no well-defined interpretation and are useful only for ranking, the Hellinger distance score defines the percentage of a document that ties directly to your theme.

This well-defined score enables not only search ranking but also semantic search filtering, with the ability to set a hand-selected hard cutoff.
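
For readers unfamiliar with the metric, here is a minimal sketch of the textbook Hellinger distance between two discrete probability distributions. It is not the SDK's implementation: how Sturdy Statistics derives the score behind semantic_search_cutoff from this distance is not described here, and the topic distributions below are made up for illustration.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical topic distributions over three topics (each sums to 1).
query_theme = [0.8, 0.2, 0.0]
excerpt_a = [0.7, 0.3, 0.0]   # thematically close to the query
excerpt_b = [0.0, 0.1, 0.9]   # mostly about something else

dist_a = hellinger(query_theme, excerpt_a)
dist_b = hellinger(query_theme, excerpt_b)
print(f"excerpt_a: {dist_a:.3f}  excerpt_b: {dist_b:.3f}")
```

The distance is bounded between 0 (identical distributions) and 1 (no overlap), which is what makes a fixed cutoff meaningful in a way that a raw cosine score is not.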

docdf = index.query(SEARCH_QUERY, semantic_search_cutoff=.5, limit=100)
displayText(docdf.iloc[[0, -1]], highlight=[SEARCH_QUERY, "god", "christian", "theology"])

Result 1/24

2014-12-12

Theism and Modern Physics: 100%

A Thomistic exploration of the unity of Truth in the science and religion dialogue: seeking oneness of the human experience

…

Result 24/24

None

Theism and Modern Physics: 57%

Quantum Gravity and Entropy: 25%

Emergent Semiclassical Time: 7%

Cyclic Cosmological Scenarios: 5%

Quantum Gravity Theories: 5%

Abstract. This paper focuses on four passages in the journey of the universe from beginning to end: its origin in the Big Bang, the production of heavy elements in first generation stars, the buzzing symphony of life on earth, and the distant future of the cosmos. As a physicist and a Christian theologian, I will ask how each of these passages casts light on the deepest questions of existence and our relation to God, and in turn how these questions are being explored through ongoing research into the interaction between Christian theology and the natural sciences.

Sunburst + Search

You will notice that each excerpt is accompanied by a set of tags. These tags are the exact same topics that we visualized in the sunburst. The topicSearch API (and all other topic APIs) are simple rollups over the segments of data you define.

The Sturdy Statistics API also offers a unified interface for querying thematic content. This leverages our vertically integrated thematic search. The search query you provide here matches the exact same set of documents as the query above, but instead of returning the data, it returns a rollup of the thematic content of the data.

topic_df = index.topicSearch(SEARCH_QUERY, semantic_search_cutoff=.5).head(10)
topic_df["title"] = f"Citations about: {SEARCH_QUERY}"
fig = px.sunburst(topic_df, path=["title", "short_title"], values="mentions", hover_data={"topic_id": True})
fig = procFig(fig, height=500)
fig.show()
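
The idea of a rollup over excerpt tags can be illustrated without the API. The sketch below takes the topic tags printed for Result 1/24 and Result 24/24 above and counts, per topic, how many excerpts carry it and what share it holds on average. The column names and the exact definitions of mentions and prevalence used by topicSearch are assumptions here; this only shows the shape of the aggregation.

```python
from collections import Counter

# Topic shares per excerpt, copied from the two search results shown above.
excerpts = [
    {"Theism and Modern Physics": 1.00},
    {
        "Theism and Modern Physics": 0.57,
        "Quantum Gravity and Entropy": 0.25,
        "Emergent Semiclassical Time": 0.07,
        "Cyclic Cosmological Scenarios": 0.05,
        "Quantum Gravity Theories": 0.05,
    },
]

mentions = Counter()     # number of excerpts tagged with each topic
total_share = Counter()  # summed share of each topic across excerpts
for tags in excerpts:
    for topic, share in tags.items():
        mentions[topic] += 1
        total_share[topic] += share

# Average share over all excerpts, including those where the topic is absent.
rollup = {
    topic: {"mentions": mentions[topic], "mean_share": total_share[topic] / len(excerpts)}
    for topic in mentions
}
for topic, stats in sorted(rollup.items(), key=lambda kv: -kv[1]["mean_share"]):
    print(f'{topic}: mentions={stats["mentions"]}, mean_share={stats["mean_share"]:.3f}')
```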

Digging into Semantic Topics

We can select any point above and pull out the actual excerpts that comprise it. Let’s say we are interested in diving into the topic Quantum Gravity Theories (topic 19). We can easily query our index to pull out the matching excerpts.

This query is filtering on documents that match on religion as well as the topic Quantum Gravity Theories (topic 19).

topic_id = 19
row = topic_df.loc[topic_df.topic_id==topic_id]
row

	short_title	topic_id	mentions	prevalence	one_sentence_summary	executive_paragraph_summary	topic_group_id	topic_group_short_title	conc	entropy	title
1	Quantum Gravity Theories	19	5.0	0.049442	The documents collectively explore the various...	The ongoing pursuit of a comprehensive quantum...	2	Quantum Gravity Concepts	53.266777	6.642675	Citations about: religion

docdf = index.query(SEARCH_QUERY, topic_id=topic_id, semantic_search_cutoff=.5, limit=100)
print("Search:", SEARCH_QUERY, "Topic:", row.short_title.iloc[0])
displayText(docdf.iloc[[0, -1]], highlight=[SEARCH_QUERY, "atheism", "faith", "aquinas", "christian", "creation"])

## NB the number of excerpts lines up with the number of mentions
assert len(docdf) == row.mentions.iloc[0]

Search: religion Topic: Quantum Gravity Theories

Result 1/5

2015-03-25

Theism and Modern Physics: 83%

Quantum Gravity Theories: 11%

Quantum Constraints and Invariance: 2%

Support has been lent to contemporary ‘New Atheism’ from physicalist interpretations of ‘hard’ science. From this perspective, any system of knowledge that does not rely solely upon empirical method is deemed meaningless in comparison to observationally-grounded empirical science. Consequently, as a non-empirically-based approach, faith positions are included in the critique offered by physicalists. The impetus for this article, then, is to establish physicalism as a reductionist epistemology that is partially comprised of – seemingly inconspicuously embedded – metaphysical assumptions. With metaphysics apparent in ‘hard’ science, it is argued from a Thomist perspective that metaphysical themes of primary causality must be realistically considered to account for being. As a logical outcome, the proposal is made that metaphysical primary causality directs to the reasonable suggestion that God creates. Intradisciplinary and/or interdisciplinary implications: This article specifically challenges the currently trendy ‘New Atheist’ school of thought, resting upon the counter-argument offered that ‘hard’ science cannot ultimately account for the emergence or continued existence of being. Utilising Aquinas, the research calls for a re-embracement of unified, as opposed to limited, systems of knowledge.

…

Result 5/5

2025-03-03

Theism and Modern Physics: 58%

Quantum Gravity Theories: 29%

Emergence in Gravity: 3%

Quantum Constraints and Invariance: 3%

The paper examines the main components of the Christian view of creation in light of a scientific worldview. Although no scientific theory can refute or corroborate the doctrine of creation, the current state of natural sciences does have some impact on the formation of creationism. The author underlines this influence and depicts the importance of a rational and aspectual approach to the Christian idea of creation and the Creator–creation relationship. The concept of creation is also presented as having significant consequences for the expression of various elements of religious doctrine, including the place of man in the world.

Semantically Structured SQL

Sturdy Statistics embeds all of its semantic information into a tabular format. It directly exposes this tabular format through the queryMeta api.

In fact, all of our topic apis directly query the same tabular data structures that we expose in the queryMeta api.

SQL Integrated Semantic Search

Similar to our topic API, Sturdy Statistics integrates its semantic search directly into its SQL API, enabling powerful SQL analyses.

df = index.queryMeta("""
SELECT 
    year(published::DATE) as year, 
    count(*) as publications
FROM doc 
GROUP BY year
ORDER BY year
""",
SEARCH_QUERY)

fig = px.line(df, 
"year", "publications", 
line_shape="hvh", 
title=f"Citations discussing {SEARCH_QUERY}"
)
fig = procFig(fig)
fig

SQL Topic Trends

Just as we were able to focus in on a specific topic in our query, we can also query topics directly within SQL.

Below, we count the paragraphs per year that mention the topic Quantum Gravity Theories (topic 19).

topic_id = 19
df = index.queryMeta(f"""
SELECT 
    year(published::DATE) as year, 
    sum((sparse_list_extract(
        {topic_id+1}, -- 1 indexed
        c_mean_avg_inds, 
        c_mean_avg_vals
    ) > 2.0)::INT) as publications 
FROM paragraph
GROUP BY year
ORDER BY year
""",
search_query=SEARCH_QUERY, semantic_search_cutoff=.5)
fig = px.bar(df, "year", "publications", title=f"Publications {SEARCH_QUERY} & Quantum Gravity Theories")
procFig(fig).show()
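
The SQL above reads a per-paragraph topic value out of a sparse vector stored as two parallel lists. The Python sketch below imitates that lookup and the per-year count on made-up rows. The behaviour of sparse_list_extract is inferred from its name, its arguments, and the "1 indexed" comment in the query; returning 0 for a position that is not stored is an assumption, and the paragraph data is hypothetical.

```python
def sparse_list_extract(position, inds, vals, default=0.0):
    """Return the value stored at 1-based `position` in a sparse vector.

    `inds` holds the stored positions and `vals` the matching values.
    Positions that are not stored are treated as `default` (an assumption).
    """
    for index, value in zip(inds, vals):
        if index == position:
            return value
    return default


topic_id = 19

# Hypothetical paragraphs: (year, c_mean_avg_inds, c_mean_avg_vals)
paragraphs = [
    (2015, [17, 20], [1.1, 3.4]),  # topic 19 sits at position 20, value 3.4
    (2015, [20], [0.5]),           # present but below the 2.0 threshold
    (2025, [5, 20], [9.0, 2.6]),
    (2025, [5], [4.0]),            # topic 19 not stored at all
]

# Equivalent of: sum((sparse_list_extract(...) > 2.0)::INT) ... GROUP BY year
counts = {}
for year, inds, vals in paragraphs:
    hit = sparse_list_extract(topic_id + 1, inds, vals) > 2.0
    counts[year] = counts.get(year, 0) + int(hit)

print(counts)
```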

NB: our sql corresponds to our search docs

assert df.publications.sum() == len(docdf)

Even More Granular Extraction

Every high level visualization or rollup can be instantly tied back to the original data, no matter how granular or complex.

Let’s say we want to pull out all citing publications that discuss religion, mention the topic Quantum Gravity Theories, and were published after January 1, 2015. That is a simple API call.

docdf = index.query(SEARCH_QUERY, topic_id=topic_id, semantic_search_cutoff=.5, limit=100, filters="published > '2015-01-01'")
print("Search:", SEARCH_QUERY, "Topic:", row.short_title.iloc[0])
displayText(docdf.iloc[[0, -1]], highlight=[SEARCH_QUERY, "atheism", "faith", "aquinas", "christian", "creation"])

Search: religion Topic: Quantum Gravity Theories

Result 1/3

2015-03-25

Theism and Modern Physics: 83%

Quantum Gravity Theories: 11%

Quantum Constraints and Invariance: 2%

Support has been lent to contemporary ‘New Atheism’ from physicalist interpretations of ‘hard’ science. From this perspective, any system of knowledge that does not rely solely upon empirical method is deemed meaningless in comparison to observationally-grounded empirical science. Consequently, as a non-empirically-based approach, faith positions are included in the critique offered by physicalists. The impetus for this article, then, is to establish physicalism as a reductionist epistemology that is partially comprised of – seemingly inconspicuously embedded – metaphysical assumptions. With metaphysics apparent in ‘hard’ science, it is argued from a Thomist perspective that metaphysical themes of primary causality must be realistically considered to account for being. As a logical outcome, the proposal is made that metaphysical primary causality directs to the reasonable suggestion that God creates. Intradisciplinary and/or interdisciplinary implications: This article specifically challenges the currently trendy ‘New Atheist’ school of thought, resting upon the counter-argument offered that ‘hard’ science cannot ultimately account for the emergence or continued existence of being. Utilising Aquinas, the research calls for a re-embracement of unified, as opposed to limited, systems of knowledge.

…

Result 3/3

2025-03-03

Theism and Modern Physics: 58%

Quantum Gravity Theories: 29%

Emergence in Gravity: 3%

Quantum Constraints and Invariance: 3%

The paper examines the main components of the Christian view of creation in light of a scientific worldview. Although no scientific theory can refute or corroborate the doctrine of creation, the current state of natural sciences does have some impact on the formation of creationism. The author underlines this influence and depicts the importance of a rational and aspectual approach to the Christian idea of creation and the Creator–creation relationship. The concept of creation is also presented as having significant consequences for the expression of various elements of religious doctrine, including the place of man in the world.

NB: our sql again corresponds to our search docs

assert df.loc[df["year"] >= 2015].publications.sum() == len(docdf)
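
One subtlety in the assertion above: the API call filters with published > '2015-01-01', while the DataFrame check uses year >= 2015. The two agree unless a document is dated exactly January 1, 2015. The sketch below shows the difference on a small list of dates; three are taken from the results above and the 2015-01-01 entry is a hypothetical edge case.

```python
from datetime import date

published = [
    date(2014, 12, 12),
    date(2015, 1, 1),   # hypothetical edge case, not from the index
    date(2015, 3, 25),
    date(2025, 3, 3),
]

cutoff = date(2015, 1, 1)
by_date = [d for d in published if d > cutoff]       # strict date filter
by_year = [d for d in published if d.year >= 2015]   # year-based filter

print(len(by_date), len(by_year))
```

If exact agreement matters for your own data, using the same predicate on both sides avoids the edge case.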

Unlock Your Unstructured Data Today

from sturdystats import Index

index = Index(name="Custom Analysis")
index.upload(df.to_dict("records"))
index.commit()
index.train()

# Ready to Explore 
index.topicSearch()

More Examples

Changes in Novartis’ News Coverage

Transformer Architecture Structured Citations

How ArXiv Machine Learning Publications Have Changed This Decade

HackerNews’ Discussion on DuckDB vs Pandas

Kanye West News

Radford Neal’s Publications over Time

Nested Hierarchical Structuring of Tech Earnings Calls

Missing Links in Biomes Citation Network Analysis