Search Papers

get_papers_count()

Number of papers for a given query

make_search_url()

Make a search URL for OnePetro

run_papers_search()

Run a papers search providing multiple keywords and optionally save results.

onepetro_page_to_dataframe()

Reads a OnePetro URL and converts it to a dataframe

read_multipage()

Reads metadata in groups of 1000 papers

join_keywords()

Get paper count and paper dataframe by joining keywords as vectors

read_multidoc()

Read metadata for all OnePetro papers by document type

read_onepetro()

Read a OnePetro web page given a query URL

Papers Summary

papers_by_type()

Summary by document type

papers_by_year()

Papers by year

papers_by_publication()

Papers by publication

papers_by_publisher()

Papers by publisher

summary_by_dates()

Summary by year

summary_by_doctype()

Summary by document type

summary_by_publications()

Summary by publication

summary_by_publisher()

Summary by publisher

Data Manipulation

remove_duplicates_by()

Remove duplicate papers by a variable

generate_offline_data()

Generate mockup data for offline testing

Data Mining

get_term_document_matrix()

Get a TermDocumentMatrix corpus object

get_top_term_papers()

Get papers for top "N" terms

plot_cluster_dendrogram()

Plot a dendrogram

plot_bars()

Plot frequency distribution with horizontal bars

plot_relationships()

Plot a relationship diagram with weights

plot_wordcloud()

Plot a word cloud

term_frequency_n_grams()

Find the frequency for two or more words together

term_frequency()

Word frequency dataframe

Datasets

custom_stopwords

Default custom stop words

discipline_labels

Discipline and Subject labels dataset

Miscellaneous

petro.One-package

Text mining and statistics for OnePetro papers

use_example()

Unpack an example