在这个笔记本中,我们将使用一个Paul Graham的文章,将其分成片段,使用Azure OpenAI嵌入模型进行嵌入,将其加载到Azure AI搜索索引中,然后进行查询。
如果您在colab上打开这个笔记本,您可能需要安装LlamaIndex 🦙。
!pip install llama-index
!pip install wget
%pip install llama-index-vector-stores-azureaisearch
%pip install azure-search-documents==11.4.0
%llama-index-embeddings-azure-openai
%llama-index-llms-azure-openai
import logging
import sys
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from IPython.display import Markdown, display
from llama_index.core import (
SimpleDirectoryReader,
StorageContext,
VectorStoreIndex,
)
from llama_index.core.settings import Settings
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.vector_stores.azureaisearch import AzureAISearchVectorStore
from llama_index.vector_stores.azureaisearch import (
IndexManagement,
MetadataIndexFieldType,
)
aoai_api_key = "YOUR_AZURE_OPENAI_API_KEY"
aoai_endpoint = "YOUR_AZURE_OPENAI_ENDPOINT"
aoai_api_version = "2023-05-15"
llm = AzureOpenAI(
model="YOUR_AZURE_OPENAI_COMPLETION_MODEL_NAME",
deployment_name="YOUR_AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME",
api_key=aoai_api_key,
azure_endpoint=aoai_endpoint,
api_version=aoai_api_version,
)
# 您需要部署自己的嵌入模型以及自己的聊天完成模型
embed_model = AzureOpenAIEmbedding(
model="YOUR_AZURE_OPENAI_EMBEDDING_MODEL_NAME",
deployment_name="YOUR_AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
api_key=aoai_api_key,
azure_endpoint=aoai_endpoint,
api_version=aoai_api_version,
)
search_service_api_key = "YOUR-AZURE-SEARCH-SERVICE-ADMIN-KEY"
search_service_endpoint = "YOUR-AZURE-SEARCH-SERVICE-ENDPOINT"
search_service_api_version = "2023-11-01"
credential = AzureKeyCredential(search_service_api_key)
# 要使用的索引名称
index_name = "llamaindex-vector-demo"
# 使用索引客户端来演示创建索引
index_client = SearchIndexClient(
endpoint=search_service_endpoint,
credential=credential,
)
# 使用搜索客户端来演示使用现有索引
search_client = SearchClient(
endpoint=search_service_endpoint,
index_name=index_name,
credential=credential,
)
演示了如果不存在的话创建一个名为"llamaindex-vector-demo"的向量索引。该索引具有以下字段:
| 字段名 | OData类型 |
|----------|--------------------------|
| id | Edm.String
|
| chunk | Edm.String
|
| embedding| Collection(Edm.Single)
|
| metadata | Edm.String
|
| doc_id | Edm.String
|
| author | Edm.String
|
| theme | Edm.String
|
| director | Edm.String
|
metadata_fields = {
"author": "author",
"theme": ("topic", MetadataIndexFieldType.STRING),
"director": "director",
}
vector_store = AzureAISearchVectorStore(
search_or_index_client=index_client,
filterable_metadata_field_keys=metadata_fields,
index_name=index_name,
index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
id_field_key="id",
chunk_field_key="chunk",
embedding_field_key="embedding",
embedding_dimensionality=1536,
metadata_string_field_key="metadata",
doc_id_field_key="doc_id",
language_analyzer="en.lucene",
vector_algorithm_type="exhaustiveKnn",
)
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
使用SimpleDirectoryReader加载存储在data/paul_graham/
中的文档。
# 加载文档
documents = SimpleDirectoryReader("../data/paul_graham/").load_data()
storage_context = StorageContext.from_defaults(vector_store=vector_store)
Settings.llm = llm
Settings.embed_model = embed_model
index = VectorStoreIndex.from_documents(
documents, storage_context=storage_context
)
# 查询数据
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("作者在成长过程中做了什么?")
display(Markdown(f"<b>{response}</b>"))
The author engaged in writing and programming activities during their formative years. They initially wrote short stories and later transitioned to programming on the IBM 1401 using an early version of Fortran. Subsequently, with the advent of microcomputers, the author began programming on a TRS-80, writing simple games, a rocket flight prediction program, and a word processor.
response = query_engine.query(
"What did the author learn?",
)
display(Markdown(f"<b>{response}</b>"))
The author learned that the study of philosophy in college did not live up to their expectations, as they found the courses to be boring and lacking in ultimate truths. This led them to switch their focus to AI, which was influenced by a novel featuring an intelligent computer and a PBS documentary showcasing advanced technology.
在 Pandas 中,我们可以使用已经存在的索引来创建新的数据框。这可以通过 reindex
方法来实现。让我们来看一个例子:
index_name = "llamaindex-vector-demo"
metadata_fields = {
"author": "author",
"theme": ("topic", MetadataIndexFieldType.STRING),
"director": "director",
}
vector_store = AzureAISearchVectorStore(
search_or_index_client=search_client,
filterable_metadata_field_keys=metadata_fields,
index_management=IndexManagement.VALIDATE_INDEX,
id_field_key="id",
chunk_field_key="chunk",
embedding_field_key="embedding",
embedding_dimensionality=1536,
metadata_string_field_key="metadata",
doc_id_field_key="doc_id",
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
[],
storage_context=storage_context,
)
query_engine = index.as_query_engine()
response = query_engine.query("What was a hard moment for the author?")
display(Markdown(f"<b>{response}</b>"))
The author faced a challenging moment when he couldn't figure out what to do with the early computer he had access to in 9th grade. This was due to the limited options for input and the lack of knowledge in math to do anything interesting with the available resources.
response = query_engine.query("Who is the author?")
display(Markdown(f"<b>{response}</b>"))
Paul Graham
import time
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What happened at interleaf?")
start_time = time.time()
token_count = 0
for token in response.response_gen:
print(token, end="")
token_count += 1
time_elapsed = time.time() - start_time
tokens_per_second = token_count / time_elapsed
print(f"\n\nStreamed output at {tokens_per_second} tokens/s")
The author worked at Interleaf, where they learned several lessons, including the importance of product-focused leadership in technology companies, the drawbacks of code being edited by too many people, the limitations of conventional office hours for optimal hacking, and the risks associated with bureaucratic customers. Additionally, the author discovered the concept that the low end tends to dominate the high end, and that being the "entry level" option can be advantageous. Streamed output at 99.40073103089465 tokens/s
要将文档添加到现有的索引中,可以使用以下代码示例:
from elasticsearch import Elasticsearch
# 连接到Elasticsearch实例
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])
# 索引名称
index_name = "your_index_name"
# 文档数据
document = {
"title": "Sample document",
"content": "This is a sample document for indexing."
}
# 将文档添加到索引
es.index(index=index_name, body=document)
在这个示例中,我们首先连接到Elasticsearch实例,然后指定要添加文档的索引名称和文档数据。最后,我们使用es.index
方法将文档添加到指定的索引中。
response = query_engine.query("What colour is the sky?")
display(Markdown(f"<b>{response}</b>"))
Blue
from llama_index.core import Document
index.insert_nodes([Document(text="The sky is indigo today")])
response = query_engine.query("What colour is the sky?")
display(Markdown(f"<b>{response}</b>"))
The sky is indigo today.
from llama_index.core.schema import TextNode
nodes = [
TextNode(
text="The Shawshank Redemption",
metadata={
"author": "Stephen King",
"theme": "Friendship",
},
),
TextNode(
text="The Godfather",
metadata={
"director": "Francis Ford Coppola",
"theme": "Mafia",
},
),
TextNode(
text="Inception",
metadata={
"director": "Christopher Nolan",
},
),
]
index.insert_nodes(nodes)
from llama_index.core.vector_stores.types import (
MetadataFilters,
ExactMatchFilter,
)
filters = MetadataFilters(
filters=[ExactMatchFilter(key="theme", value="Mafia")]
)
retriever = index.as_retriever(filters=filters)
retriever.retrieve("What is inception about?")
[NodeWithScore(node=TextNode(id_='049f00de-13be-4af3-ab56-8c16352fe799', embedding=None, metadata={'director': 'Francis Ford Coppola', 'theme': 'Mafia'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='ad2a08d4364262546db9711b915348d43e0ccc41bd8c3c41775e133624e1fa1b', text='The Godfather', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.8120511)]
支持四种查询模式:DEFAULT(向量搜索)、SPARSE(稀疏)、HYBRID(混合)和SEMANTIC_HYBRID(语义混合)。
from llama_index.core.vector_stores.types import VectorStoreQueryMode
# 默认的检索器
default_retriever = index.as_retriever(
vector_store_query_mode=VectorStoreQueryMode.DEFAULT
)
response = default_retriever.retrieve("Inception是关于什么的?")
# 遍历响应中的每个NodeWithScore
for node_with_score in response:
node = node_with_score.node # TextNode对象
score = node_with_score.score # 相似度分数
chunk_id = node.id_ # 块ID
# 从节点中提取相关的元数据
file_name = node.metadata.get("file_name", "未知")
file_path = node.metadata.get("file_path", "未知")
# 从节点中提取文本内容
text_content = node.text if node.text else "无可用内容"
# 以用户友好的格式打印结果
print(f"分数: {score}")
print(f"文件名: {file_name}")
print(f"ID: {chunk_id}")
print("\n提取的内容:")
print(text_content)
print("\n" + "=" * 40 + " 结果结束 " + "=" * 40 + "\n")
Score: 0.8748552 File Name: Unknown Id: bae0df75-ff37-4725-b659-b9fd8bf2ef3c Extracted Content: Inception ======================================== End of Result ======================================== Score: 0.8155207 File Name: paul_graham_essay.txt Id: ae5aee85-a083-4141-bf75-bbb872f53760 Extracted Content: It's not that unprestigious types of work are good per se. But when you find yourself drawn to some kind of work despite its current lack of prestige, it's a sign both that there's something real to be discovered there, and that you have the right kind of motives. Impure motives are a big danger for the ambitious. If anything is going to lead you astray, it will be the desire to impress people. So while working on things that aren't prestigious doesn't guarantee you're on the right track, it at least guarantees you're not on the most common type of wrong one. Over the next several years I wrote lots of essays about all kinds of different topics. O'Reilly reprinted a collection of them as a book, called Hackers & Painters after one of the essays in it. I also worked on spam filters, and did some more painting. I used to have dinners for a group of friends every thursday night, which taught me how to cook for groups. And I bought another building in Cambridge, a former candy factory (and later, twas said, porn studio), to use as an office. One night in October 2003 there was a big party at my house. It was a clever idea of my friend Maria Daniels, who was one of the thursday diners. Three separate hosts would all invite their friends to one party. So for every guest, two thirds of the other guests would be people they didn't know but would probably like. One of the guests was someone I didn't know but would turn out to like a lot: a woman called Jessica Livingston. A couple days later I asked her out. Jessica was in charge of marketing at a Boston investment bank. This bank thought it understood startups, but over the next year, as she met friends of mine from the startup world, she was surprised how different reality was. And how colorful their stories were. So she decided to compile a book of interviews with startup founders. When the bank had financial problems and she had to fire half her staff, she started looking for a new job. In early 2005 she interviewed for a marketing job at a Boston VC firm. It took them weeks to make up their minds, and during this time I started telling her about all the things that needed to be fixed about venture capital. They should make a larger number of smaller investments instead of a handful of giant ones, they should be funding younger, more technical founders instead of MBAs, they should let the founders remain as CEO, and so on. One of my tricks for writing essays had always been to give talks. The prospect of having to stand up in front of a group of people and tell them something that won't waste their time is a great spur to the imagination. When the Harvard Computer Society, the undergrad computer club, asked me to give a talk, I decided I would tell them how to start a startup. Maybe they'd be able to avoid the worst of the mistakes we'd made. So I gave this talk, in the course of which I told them that the best sources of seed funding were successful startup founders, because then they'd be sources of advice too. Whereupon it seemed they were all looking expectantly at me. Horrified at the prospect of having my inbox flooded by business plans (if I'd only known), I blurted out "But not me!" and went on with the talk. But afterward it occurred to me that I should really stop procrastinating about angel investing. I'd been meaning to since Yahoo bought us, and now it was 7 years later and I still hadn't done one angel investment. Meanwhile I had been scheming with Robert and Trevor about projects we could work on together. I missed working with them, and it seemed like there had to be something we could collaborate on. As Jessica and I were walking home from dinner on March 11, at the corner of Garden and Walker streets, these three threads converged. Screw the VCs who were taking so long to make up their minds. We'd start our own investment firm and actually implement the ideas we'd been talking about. I'd fund it, and Jessica could quit her job and work for it, and we'd get Robert and Trevor as partners too. [13] Once again, ignorance worked in our favor. We had no idea how to be angel investors, and in Boston in 2005 there were no Ron Conways to learn from. So we just made what seemed like the obvious choices, and some of the things we did turned out to be novel. There are multiple components to Y Combinator, and we didn't figure them all out at once. The part we got first was to be an angel firm. ======================================== End of Result ========================================
from llama_index.core.vector_stores.types import VectorStoreQueryMode
hybrid_retriever = index.as_retriever(
vector_store_query_mode=VectorStoreQueryMode.HYBRID
)
hybrid_retriever.retrieve("What is inception about?")
[NodeWithScore(node=TextNode(id_='bae0df75-ff37-4725-b659-b9fd8bf2ef3c', embedding=None, metadata={'director': 'Christopher Nolan'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='9792a1fd7d2e1a08f1b1d70a597357bb6b68d69ed5685117eaa37ac9e9a3565e', text='Inception', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.03181818127632141), NodeWithScore(node=TextNode(id_='ae5aee85-a083-4141-bf75-bbb872f53760', embedding=None, metadata={'file_path': '..\\data\\paul_graham\\paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75395, 'creation_date': '2023-12-12', 'last_modified_date': '2023-12-12', 'last_accessed_date': '2024-02-02'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='627552ee-116a-4132-a7d3-7e7232f75866', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'file_path': '..\\data\\paul_graham\\paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75395, 'creation_date': '2023-12-12', 'last_modified_date': '2023-12-12', 'last_accessed_date': '2024-02-02'}, hash='0a59e1ce8e50a67680a5669164f79e524087270ce183a3971fcd18ac4cad1fa0'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='24a1d375-31e3-492c-ac02-5091e3572e3f', node_type=<ObjectType.TEXT: '1'>, metadata={'file_path': '..\\data\\paul_graham\\paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75395, 'creation_date': '2023-12-12', 'last_modified_date': '2023-12-12', 'last_accessed_date': '2024-02-02'}, hash='51c474a12ac8e9748258b2c7bbe77bb7c8bf35b775ed44f016057a0aa8b0bd76'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='196569e0-2b10-4ba3-8263-a69fb78dd98c', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='192082e7ba84b8c5e2a64bd1d422c6c503189fc3ba325bb3e6e8bdb43db03fbb')}, hash='a3ea638857f1daadf7af967322480f97e1235dac3ee7d72b8024670785df8810', text='It\'s not that unprestigious types of work are good per se. But when you find yourself drawn to some kind of work despite its current lack of prestige, it\'s a sign both that there\'s something real to be discovered there, and that you have the right kind of motives. Impure motives are a big danger for the ambitious. If anything is going to lead you astray, it will be the desire to impress people. So while working on things that aren\'t prestigious doesn\'t guarantee you\'re on the right track, it at least guarantees you\'re not on the most common type of wrong one.\n\nOver the next several years I wrote lots of essays about all kinds of different topics. O\'Reilly reprinted a collection of them as a book, called Hackers & Painters after one of the essays in it. I also worked on spam filters, and did some more painting. I used to have dinners for a group of friends every thursday night, which taught me how to cook for groups. And I bought another building in Cambridge, a former candy factory (and later, twas said, porn studio), to use as an office.\n\nOne night in October 2003 there was a big party at my house. It was a clever idea of my friend Maria Daniels, who was one of the thursday diners. Three separate hosts would all invite their friends to one party. So for every guest, two thirds of the other guests would be people they didn\'t know but would probably like. One of the guests was someone I didn\'t know but would turn out to like a lot: a woman called Jessica Livingston. A couple days later I asked her out.\n\nJessica was in charge of marketing at a Boston investment bank. This bank thought it understood startups, but over the next year, as she met friends of mine from the startup world, she was surprised how different reality was. And how colorful their stories were. So she decided to compile a book of interviews with startup founders.\n\nWhen the bank had financial problems and she had to fire half her staff, she started looking for a new job. In early 2005 she interviewed for a marketing job at a Boston VC firm. It took them weeks to make up their minds, and during this time I started telling her about all the things that needed to be fixed about venture capital. They should make a larger number of smaller investments instead of a handful of giant ones, they should be funding younger, more technical founders instead of MBAs, they should let the founders remain as CEO, and so on.\n\nOne of my tricks for writing essays had always been to give talks. The prospect of having to stand up in front of a group of people and tell them something that won\'t waste their time is a great spur to the imagination. When the Harvard Computer Society, the undergrad computer club, asked me to give a talk, I decided I would tell them how to start a startup. Maybe they\'d be able to avoid the worst of the mistakes we\'d made.\n\nSo I gave this talk, in the course of which I told them that the best sources of seed funding were successful startup founders, because then they\'d be sources of advice too. Whereupon it seemed they were all looking expectantly at me. Horrified at the prospect of having my inbox flooded by business plans (if I\'d only known), I blurted out "But not me!" and went on with the talk. But afterward it occurred to me that I should really stop procrastinating about angel investing. I\'d been meaning to since Yahoo bought us, and now it was 7 years later and I still hadn\'t done one angel investment.\n\nMeanwhile I had been scheming with Robert and Trevor about projects we could work on together. I missed working with them, and it seemed like there had to be something we could collaborate on.\n\nAs Jessica and I were walking home from dinner on March 11, at the corner of Garden and Walker streets, these three threads converged. Screw the VCs who were taking so long to make up their minds. We\'d start our own investment firm and actually implement the ideas we\'d been talking about. I\'d fund it, and Jessica could quit her job and work for it, and we\'d get Robert and Trevor as partners too. [13]\n\nOnce again, ignorance worked in our favor. We had no idea how to be angel investors, and in Boston in 2005 there were no Ron Conways to learn from. So we just made what seemed like the obvious choices, and some of the things we did turned out to be novel.\n\nThere are multiple components to Y Combinator, and we didn\'t figure them all out at once. The part we got first was to be an angel firm.', start_char_idx=45670, end_char_idx=50105, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=0.03009207174181938)]
这种模式将语义重新排序应用于混合搜索结果,以提高搜索相关性。
请查看此链接以获取更多详细信息:https://learn.microsoft.com/azure/search/semantic-search-overview
hybrid_retriever = index.as_retriever(
vector_store_query_mode=VectorStoreQueryMode.SEMANTIC_HYBRID
)
hybrid_retriever.retrieve("What is inception about?")
[NodeWithScore(node=TextNode(id_='bae0df75-ff37-4725-b659-b9fd8bf2ef3c', embedding=None, metadata={'director': 'Christopher Nolan'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='9792a1fd7d2e1a08f1b1d70a597357bb6b68d69ed5685117eaa37ac9e9a3565e', text='Inception', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=2.3949906826019287), NodeWithScore(node=TextNode(id_='fc9782a2-c255-4265-a618-3a864abe598d', embedding=None, metadata={'file_path': '..\\data\\paul_graham\\paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75395, 'creation_date': '2023-12-12', 'last_modified_date': '2023-12-12', 'last_accessed_date': '2024-02-02'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='627552ee-116a-4132-a7d3-7e7232f75866', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'file_path': '..\\data\\paul_graham\\paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75395, 'creation_date': '2023-12-12', 'last_modified_date': '2023-12-12', 'last_accessed_date': '2024-02-02'}, hash='0a59e1ce8e50a67680a5669164f79e524087270ce183a3971fcd18ac4cad1fa0'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='94d87013-ea3d-4a9c-982a-dde5ff219983', node_type=<ObjectType.TEXT: '1'>, metadata={'file_path': '..\\data\\paul_graham\\paul_graham_essay.txt', 'file_name': 'paul_graham_essay.txt', 'file_type': 'text/plain', 'file_size': 75395, 'creation_date': '2023-12-12', 'last_modified_date': '2023-12-12', 'last_accessed_date': '2024-02-02'}, hash='f28897170c6b61162069af9ee83dc11e13fa0f6bf6efaa7b3911e6ad9093da84'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='dc3852e5-4c1e-484e-9e65-f17084d3f7b4', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='deaee6d5c992dbf757876957aa9112a42d30a636c6c83d81fcfac4aaf2d24dee')}, hash='a3b31e5ec2b5d4a9b3648de310c8a5962c17afdb800ea0e16faa47956607866d', text='And at the same time all involved would adhere outwardly to the conventions of a 19th century atelier. We actually had one of those little stoves, fed with kindling, that you see in 19th century studio paintings, and a nude model sitting as close to it as possible without getting burned. Except hardly anyone else painted her besides me. The rest of the students spent their time chatting or occasionally trying to imitate things they\'d seen in American art magazines.\n\nOur model turned out to live just down the street from me. She made a living from a combination of modelling and making fakes for a local antique dealer. She\'d copy an obscure old painting out of a book, and then he\'d take the copy and maltreat it to make it look old. [3]\n\nWhile I was a student at the Accademia I started painting still lives in my bedroom at night. These paintings were tiny, because the room was, and because I painted them on leftover scraps of canvas, which was all I could afford at the time. Painting still lives is different from painting people, because the subject, as its name suggests, can\'t move. People can\'t sit for more than about 15 minutes at a time, and when they do they don\'t sit very still. So the traditional m.o. for painting people is to know how to paint a generic person, which you then modify to match the specific person you\'re painting. Whereas a still life you can, if you want, copy pixel by pixel from what you\'re seeing. You don\'t want to stop there, of course, or you get merely photographic accuracy, and what makes a still life interesting is that it\'s been through a head. You want to emphasize the visual cues that tell you, for example, that the reason the color changes suddenly at a certain point is that it\'s the edge of an object. By subtly emphasizing such things you can make paintings that are more realistic than photographs not just in some metaphorical sense, but in the strict information-theoretic sense. [4]\n\nI liked painting still lives because I was curious about what I was seeing. In everyday life, we aren\'t consciously aware of much we\'re seeing. Most visual perception is handled by low-level processes that merely tell your brain "that\'s a water droplet" without telling you details like where the lightest and darkest points are, or "that\'s a bush" without telling you the shape and position of every leaf. This is a feature of brains, not a bug. In everyday life it would be distracting to notice every leaf on every bush. But when you have to paint something, you have to look more closely, and when you do there\'s a lot to see. You can still be noticing new things after days of trying to paint something people usually take for granted, just as you can after days of trying to write an essay about something people usually take for granted.\n\nThis is not the only way to paint. I\'m not 100% sure it\'s even a good way to paint. But it seemed a good enough bet to be worth trying.\n\nOur teacher, professor Ulivi, was a nice guy. He could see I worked hard, and gave me a good grade, which he wrote down in a sort of passport each student had. But the Accademia wasn\'t teaching me anything except Italian, and my money was running out, so at the end of the first year I went back to the US.\n\nI wanted to go back to RISD, but I was now broke and RISD was very expensive, so I decided to get a job for a year and then return to RISD the next fall. I got one at a company called Interleaf, which made software for creating documents. You mean like Microsoft Word? Exactly. That was how I learned that low end software tends to eat high end software. But Interleaf still had a few years to live yet. [5]\n\nInterleaf had done something pretty bold. Inspired by Emacs, they\'d added a scripting language, and even made the scripting language a dialect of Lisp. Now they wanted a Lisp hacker to write things in it. This was the closest thing I\'ve had to a normal job, and I hereby apologize to my boss and coworkers, because I was a bad employee. Their Lisp was the thinnest icing on a giant C cake, and since I didn\'t know C and didn\'t want to learn it, I never understood most of the software. Plus I was terribly irresponsible. This was back when a programming job meant showing up every day during certain working hours.', start_char_idx=14179, end_char_idx=18443, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=1.0986518859863281)]