Smart EDGAR File API

So far we have seen how to use the database interface of Smart EDGAR. In this document I give a quick overview of the core functionality of the file-based API, which does not require any DBMS.

As a precondition, the files must already have been downloaded from EDGAR.

Setup

We install the Smart EDGAR library with the help of Maven. We also install the jupyter-jdk-extensions so that ITable objects can be rendered as BeakerX tables.

In [28]:
%classpath config resolver maven-public http://software.pschatzmann.ch/repository/maven-public/
%%classpath add mvn 
ch.pschatzmann:smart-edgar:LATEST
ch.pschatzmann:jupyter-jdk-extensions:LATEST
In [29]:
import ch.pschatzmann.display._
import ch.pschatzmann.edgar.base._
import ch.pschatzmann.edgar.reporting.company._
import ch.pschatzmann.edgar.dataload.rss.RSSDataSource
import ch.pschatzmann.edgar.utils.Utils
import ch.pschatzmann.edgar.base.Fact._

Displayers.setup
Out[29]:
true

Companies

We can access the information starting from EdgarCompany, which provides the list of all companies. The example below takes the first 10 entries from EdgarCompany.stream.

In [30]:
%%time
import scala.collection.JavaConverters._

val companies = EdgarCompany.stream.iterator.asScala.slice(0,10).toSeq
CPU times: user 0 ns, sys: 219 µs, total: 219 µs 
Wall Time: 410 ms

Out[30]:
[[0001368622, 0001286181, 0001577898, 0000886136, 0000886137, 0001505155, 0000904918, 0001674335, 0001474042, 0001276262]]
In [31]:
val company = companies(0)
Out[31]:
0001368622
In [32]:
company.getCompanyName
Out[32]:
AeroVironment Inc
In [33]:
company.getSICDescription
Out[33]:
AIRCRAFT
In [34]:
company.getStateLocation
Out[34]:
CA
In [35]:
company.getStateIncorporation
Out[35]:
DE
In [36]:
val filings = company.getFilings
Out[36]:
[1368622-10-K-20120626, 1368622-10-K-20130625, 1368622-10-K-20130626, 1368622-10-K-20140708, 1368622-10-K-20150630, 1368622-10-K-20160628, 1368622-10-K-20160629, 1368622-10-K-20170627, 1368622-10-K-20170628, 1368622-10-K-20180627, 1368622-10-Q-20110908, 1368622-10-Q-20111206, 1368622-10-Q-20120306, 1368622-10-Q-20120307, 1368622-10-Q-20120905, 1368622-10-Q-20120906, 1368622-10-Q-20121205, 1368622-10-Q-20130305, 1368622-10-Q-20130827, 1368622-10-Q-20131126, 1368622-10-Q-20131127, 1368622-10-Q-20140305, 1368622-10-Q-20140903, 1368622-10-Q-20141125, 1368622-10-Q-20141126, 1368622-10-Q-20150303, 1368622-10-Q-20150901, 1368622-10-Q-20151208, 1368622-10-Q-20160308, 1368622-10-Q-20160830, 1368622-10-Q-20160831, 1368622-10-Q-20161207, 1368622-10-Q-20170307, 1368622-10-Q-20170308, 1368622-10-Q-20170829, 1368622-10-Q-20170830, 1368622-10-Q-20171205, 1368622-10-Q-20171206, 1368622-10-Q-20180306, 1368622-10-Q-20180307, 1368622-10-Q-20180905, 1368622-10-Q-20180906, 1368622-10-Q-20181129, 1368622-10-Q-20181130]
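
The filing identifiers printed above appear to follow the pattern company number, form type, filing date. The following sketch splits such an identifier with plain Scala. It does not use the Smart EDGAR API: the assumed layout is inferred only from the output above, and ParsedFiling and parseFilingId are hypothetical helper names introduced for illustration.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Assumed layout of a filing id: <company number>-<form type>-<date as yyyyMMdd>
val FilingId = """(\d+)-(.+)-(\d{8})""".r

// Hypothetical helper type, not part of Smart EDGAR
final case class ParsedFiling(cik: String, form: String, date: LocalDate)

def parseFilingId(id: String): Option[ParsedFiling] = id match {
  case FilingId(cik, form, date) =>
    // Pad the company number to 10 digits, the way company identifiers are printed above
    val paddedCik = f"${cik.toLong}%010d"
    Some(ParsedFiling(paddedCik, form, LocalDate.parse(date, DateTimeFormatter.BASIC_ISO_DATE)))
  case _ => None // the string does not have the assumed layout
}

val parsed = parseFilingId("1368622-10-K-20120626")
```

Because the form type itself contains a hyphen (10-K, 10-Q), the sketch anchors on the leading digits and the trailing 8-digit date instead of splitting on "-".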

XBRL

We can combine multiple filings into one XBRL data access object. The filings are selected with the help of a regular expression.

In [37]:
%%time
val xbrl = company.getXBRL(".*10-K.*")
CPU times: user 0 ns, sys: 229 µs, total: 229 µs 
Wall Time: 12 s

Out[37]:
ch.pschatzmann.edgar.base.XBRL@64ba18cf

...or we just use all filings, which takes noticeably longer (63 s instead of 12 s in this run)

In [38]:
%%time
val xbrlAll = company.getXBRL()
CPU times: user 0 ns, sys: 234 µs, total: 234 µs 
Wall Time: 63 s

Out[38]:
ch.pschatzmann.edgar.base.XBRL@7efaf2d8
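
To get a feeling for what a selection pattern such as ".*10-K.*" picks, the sketch below applies it to a few of the filing identifiers listed earlier, using only the standard String.matches method. That getXBRL matches its pattern against the whole identifier in the same way is an assumption, not something confirmed by the library documentation.

```scala
// A few filing ids copied from the getFilings output
val filingIds = Seq(
  "1368622-10-K-20120626",
  "1368622-10-K-20130625",
  "1368622-10-Q-20110908",
  "1368622-10-Q-20111206"
)

// String.matches requires the pattern to cover the whole string,
// hence the leading and trailing ".*"
val annual = filingIds.filter(_.matches(".*10-K.*"))
val quarterly = filingIds.filter(_.matches(".*10-Q.*"))

// A narrower pattern: annual reports filed in 2013 only
val annual2013 = filingIds.filter(_.matches(".*10-K-2013.*"))
```

Narrowing the pattern in this way is one option to keep the number of parsed files, and with it the loading time, small.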

All attribute values are indexed automatically, so we can use the findValues method to search in that index. To display the data as a table in BeakerX, we convert it to a Scala collection of Maps.

In [39]:
val cogs = xbrl.findValues("Cost of Goods Sold")

new TableDisplay(cogs.asScala.map(v => v.getAttributes.asScala.toMap))
In [40]:
val values = xbrl.findValues()

new TableDisplay(values.asScala.map(v => v.getAttributes.asScala.toMap))

We should filter out all values that are not relevant for our purpose, e.g. values that belong to a segment, are not numeric, or are empty:

In [41]:
val values = xbrl.findValues().asScala
  .filter(v => v.getContext.getSegments.isEmpty )
  .filter(v => v.getDataType == DataType.number)
  .filter(v => !v.getValue.isEmpty)
  
new TableDisplay(values.map(v => v.getAttributes.asScala.toMap))
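
The effect of the three filters can be followed on a small stand-alone model. SimpleFact below is a hypothetical, simplified stand-in for the value objects returned by findValues, and the sample rows are made up. Reading an empty segment list as "company-wide value without dimensional breakdown" is my assumption about the XBRL context.

```scala
// Hypothetical stand-in for a fact value; not a Smart EDGAR class
final case class SimpleFact(
  parameterName: String,
  value: String,
  isNumber: Boolean,
  segments: Seq[String]
)

// Made-up sample rows for illustration only
val facts = Seq(
  SimpleFact("NetIncomeLoss", "1000", isNumber = true, segments = Seq.empty),
  SimpleFact("NetIncomeLoss", "400", isNumber = true, segments = Seq("SegmentA")),
  SimpleFact("DocumentType", "10-K", isNumber = false, segments = Seq.empty),
  SimpleFact("Revenues", "", isNumber = true, segments = Seq.empty)
)

val relevant = facts
  .filter(_.segments.isEmpty) // drop values reported for an individual segment
  .filter(_.isNumber)         // drop text values
  .filter(_.value.nonEmpty)   // drop empty values
```

Only the first row passes all three filters; each of the other rows is rejected by exactly one of them.
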
In [42]:
val labelAPI = xbrl.getLabelAPI()
labelAPI.getLabel("CostOfGoodsSold").getLabel
Out[42]:
Cost of Goods Sold

Usually we want to access the numerical information. However, the getCombinedTextValues method also provides the consolidated text, which we can use to feed some NLP functionality:

In [43]:
val values = xbrl.getCombinedTextValues

new TableDisplay(values.asScala.map(v => v.getAttributes.asScala.toMap))

Additional Output

We can also convert all the values to CSV by calling the toValueCSV method

In [44]:
Utils.setCSVDelimiter(",")
val file = Utils.createTempFile(xbrl.toValueCSV)
val table = new TableDisplay(new CSV().readFile(file.getAbsolutePath))

We can also convert the (numerical) data to an ITable object

In [45]:
xbrl.toTable

CompanyEdgarValues

From the EdgarCompany object we can also access the CompanyEdgarValues class, which supports the calculation of KPIs. However, it is much more efficient to use the corresponding database functionality.

In [46]:
val values = company.getCompanyEdgarValues
    .setUseArrayList(true)
    .setAddTime(true)
    .setFilter(new FilterYearly())
    .setParameterNames("NetIncomeLoss","OperatingIncomeLoss","ResearchAndDevelopmentExpense",
        "CashAndCashEquivalentsAtCarryingValue","AvailableForSaleSecuritiesCurrent","AccountsReceivableNetCurrent",
        "Revenues","SalesRevenueNet","InventoryNet","AssetsCurrent","LiabilitiesCurrent","Assets","EarningsPerShareBasic",
        "StockholdersEquity")
    .addFormula("Revenue","Edgar.coalesce('Revenues', 'SalesRevenueNet')")
    .addFormula("QuickRatio","(CashAndCashEquivalentsAtCarryingValue + AccountsReceivableNetCurrent + AvailableForSaleSecuritiesCurrent) / LiabilitiesCurrent")
    .addFormula("CurrentRatio","AssetsCurrent / LiabilitiesCurrent")
    .addFormula("InventoryTurnover","Revenue / InventoryNet")
    .addFormula("NetProfitMargin","NetIncomeLoss / Revenue")
    .addFormula("SalesResearchRatio%","ResearchAndDevelopmentExpense / Revenue *100")
    .addFormula("NetIncomeResearchRatio%","ResearchAndDevelopmentExpense / NetIncomeLoss * 100")
    .addFormula("NetIncomeChange%", "(NetIncomeLoss - Edgar.lag('NetIncomeLoss', -1)) / Edgar.lag('NetIncomeLoss', -1) * 100")
    .addFormula("RevenueChange%", "Edgar.percentChange('Revenue')" )  
    .addFormula("ResearchAndDevelopmentChange%","Edgar.percentChange('ResearchAndDevelopmentExpense')" )
    .removeParameterNames("Revenues","SalesRevenueNet")

val list = values.getTable
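
The formulas are ordinary arithmetic over the reported parameters. As a rough illustration, the following sketch recomputes a ratio and a percent change in plain Scala on made-up numbers. It does not use the Smart EDGAR formula engine, and reading a percent change as the change relative to the previous period is my assumption about Edgar.percentChange.

```scala
// Made-up yearly figures for illustration only (not taken from any filing)
val netIncome = Seq(80.0, 100.0, 75.0)
val assetsCurrent = 300.0
val liabilitiesCurrent = 120.0

// CurrentRatio: AssetsCurrent / LiabilitiesCurrent
val currentRatio = assetsCurrent / liabilitiesCurrent

// Percent change against the previous period; the first period has no predecessor
def percentChange(values: Seq[Double]): Seq[Option[Double]] =
  values.indices.map { i =>
    if (i == 0) None
    else Some((values(i) - values(i - 1)) / values(i - 1) * 100)
  }

val netIncomeChange = percentChange(netIncome)
```

The parentheses around the difference matter: without them the division binds tighter than the subtraction, and the expression collapses to the current value minus 100.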

EdgarFiling

Instead of accessing the information by company, we can request all filings with the EdgarFiling.list() method.

In [47]:
%%time
var filings = EdgarFiling.list(".*10-K.*")

filings.size
CPU times: user 0 ns, sys: 212 µs, total: 212 µs 
Wall Time: 3 s

Out[47]:
53402
In [48]:
var filing = filings.get(0)
Out[48]:
1750-10-K-20110713
In [49]:
var companyName = filing.getCompanyInfo.getCompanyName
Out[49]:
AAR CORP
In [50]:
var xbrl = filing.getXBRL

xbrl.toTable

Downloading Data

The data can be downloaded from EDGAR with the help of the RSSDataSource. If the history flag is set to false, we just download the most recent documents from https://www.sec.gov/Archives/edgar/usgaap.rss.xml. If the history flag is set to true, we download all available data back to 2005-04.

In [51]:
import scala.collection.JavaConverters._

var downloadData = new RSSDataSource().getData(false, "10-K.*").asScala
downloadData.foreach(d => d.download())
Out[51]:
null

Last but not least, we can load an XBRL document directly from the EDGAR database via the Internet. The getXBRL method on the FeedInfoRecord parses the local XBRL file if it exists; otherwise the information is downloaded from the URL indicated in the FeedInfoRecord.

In [52]:
val first = downloadData.toSeq(0)
val xbrl = first.getXBRL
val values = xbrl.findValues().asScala

new TableDisplay(values.map(v => v.getAttributes.asScala.toMap))
In [53]:
first.getUriXbrl()
Out[53]:
https://www.sec.gov/Archives/edgar/data/1000230/000143774918022311/0001437749-18-022311-xbrl.zip
In [54]:
first.getUrlHttp()
Out[54]:
https://www.sec.gov/Archives/edgar/data/1000230/000143774918022311/0001437749-18-022311-index.htm