
Package 'rcorpora'

July 17, 2018

Title A Collection of Small Text Corpora of Interesting Data


Maintainer Gábor Csárdi

Author Darius Kazemi, Cole Willsea, Serin Delaunay, Karl Swedberg, Matthew Rothenberg, Greg Kennedy, Nathaniel Mitchell, Javier Arce, Mark Sample, Parker Higgins, Allison Parrish, Matthew Hokanson, Aaron Marriner, Casey Kolderup, Michael Paulukonis, Neil Freeman, nathan lachenmyer, Brett O'Connor, Christian Leon Christensen, David Edgar, Greg Borenstein, Jeffery Bennett, Kris Baillargeon, M. Nowak, Peter Organisciak, Rachel White, Tod Robbins, John Wiseman, Alex Fox, Alice Maz, Becca Ricks, Chris Spurgeon, Colin Mitchell, David Whitten, Mary Dickson Diaz, Michael R. Bernstein, Mike Watson, Patrick Rodriguez, Rebecca Sherman, Rebecca Turner, Ross Barclay, Ross Binden, Ryan Freebern, Will Hankinson, Stefan Bohacek, Justin Alford, Brian Detweiler, Ed Lea, John Ohno, Daniel McNally, Sean May, Tariq Ali, shubham kumar, adam malantonio, Alan Hussey, Amanda Visconti, Andreas Fuchs, Andy Craze, Andy Dayton, Ashur Cabrera, Austin Davis-Richardson, Ben Williams, Brian Chitester, Brian Gawalt, Brian Jones, Casey Olson, Chad Nelson, Cliff Rodgers, Cristian Rivas Gómez, Dan Sumption, Edward Loveall, Elijah Cobb, Garrett Miller, Grant Williamson, Ian McCowan, Jacob Fauber, Jay Mahabal, Jeoff Villanueva, Jesse Spielman, Joe Mahoney, Jordan Killpack, Josh Leong, Kay Belardinelli, K Adam White, Kristian Wichmann, Kyle McDonald, Liam Cooke, Marcos Wright-Kuhns, Mark Wunsch, Matt Beiswenger, Matthew McVickar, Matthew Molnar, Max Bittker, Michael Dewberry, Nathan Black, Noah Kantrowitz, Noah Swartz, Ranjit Bhatnagar, Ray Martinez, Rob Huzzey, Ryan Giglio, Sabareesh Iyer, Sam Raker, Tia Esguerra, Utsav Chadha, Vincent Bruijn, Will Thompson, Zac Moody, aarón montoya-moraga, Alex Miller, Delacannon, Scott Lieber, Pace Ricciardelli, Ruta Kruliauskaite, Scott Grant

Description A collection of small text corpora of interesting data. It contains all data sets from "dariusk/corpora". Some examples: names of animals: birds, dinosaurs, dogs; foods: beer categories, pizza toppings; geography: English towns, rivers, oceans; humans: authors, US presidents, occupations; science: elements, planets; words: adjectives, verbs, proverbs, US president quotes.







R topics documented:

categories
corpora
Index

categories    List data set categories in the corpora package

Description

List data set categories in the corpora package

Usage

categories()

Value

Character vector of category names.

corpora    Load a data set from the corpora package

Description

corpora is a collection of small corpora of interesting data for the creation of bots and similar stuff.

Usage

corpora(which, category)

Arguments

which: The data set to load, a string. If not given, then all data sets in the package are listed.
category: If given, which must be missing, and the data sets in the given category are listed.
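The two documented functions can be exercised as in the following minimal sketch. It assumes the rcorpora package is installed; the `rcorpora_available` flag is an illustrative name, not part of the package. The data set "animals/dog_names" and the category "animals" are taken from the lists later in this manual. The result is inspected with `str()` rather than assuming a particular structure.

```r
# Sketch only: assumes the rcorpora package is installed. The guard keeps the
# script runnable (the calls are simply skipped) when the package is missing.
rcorpora_available <- requireNamespace("rcorpora", quietly = TRUE)

if (rcorpora_available) {
  # All category names.
  cats <- rcorpora::categories()
  print(head(cats))

  # Names of the data sets in one category (which is left missing).
  print(rcorpora::corpora(category = "animals"))

  # Load a single data set by name and look at its top-level structure.
  dogs <- rcorpora::corpora("animals/dog_names")
  str(dogs, max.level = 1)
}
```

Calling `corpora()` with no arguments should, per the argument description above, list every data set in the package.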


Details

This project is a collection of static corpora (plural of "corpus") that are potentially useful in the creation of weird internet stuff. I've found that, as a creator, sometimes I am making something that needs access to a lot of adjectives, but not necessarily every adjective in the English language. So for the last year I've been copy/pasting an adjs.json file from project to project. This is kind of awful, so I'm hoping that this project will at least help me keep everything in one place.

I would like this to help with rapid prototyping of projects. For example: you might use nouns.json to start with, just to see if an idea you had was any good. Once you've built the project quickly around the nouns collection, you can then rip it out and replace it with a more complex or exhaustive data source.

I'm also hoping that this can be used as a teaching tool: maybe someone has three hours to teach how to make Twitter bots. That doesn't give the student much time to find/scrape/clean/parse interesting data. My hope is that students can be pointed to this project and they can pick and choose different interesting data sources to meld together for the creation of prototypes.

See https://github.com/dariusk/corpora
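As an illustration of the rapid-prototyping idea, the sketch below combines a random adjective with a random dog name. The helpers `first_values()` and `random_phrase()` are hypothetical and not part of rcorpora. The sketch also assumes, without confirmation from this manual, that a loaded data set carries its values as a character vector alongside metadata fields such as `description`.

```r
# Hypothetical helper (not part of rcorpora): return the first non-empty
# character vector in a loaded data set, skipping assumed metadata fields.
first_values <- function(x, skip = c("description", "source")) {
  x <- as.list(x)
  if (!is.null(names(x))) {
    x <- x[!(names(x) %in% skip)]
  }
  for (el in x) {
    if (is.character(el) && length(el) > 0) {
      return(el)
    }
  }
  character(0)
}

# Hypothetical helper: pair one random adjective with one random noun.
random_phrase <- function(adjectives, nouns) {
  paste(sample(adjectives, 1), sample(nouns, 1))
}

# Only touch the package if it is installed.
if (requireNamespace("rcorpora", quietly = TRUE)) {
  adjs <- first_values(rcorpora::corpora("words/adjs"))
  dog_names <- first_values(rcorpora::corpora("animals/dog_names"))
  if (length(adjs) > 0 && length(dog_names) > 0) {
    print(random_phrase(adjs, dog_names))
  }
}
```

Because the helpers only depend on plain lists and character vectors, the corpora-backed inputs can later be swapped for a larger data source without changing `random_phrase()`.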

Value

A data frame containing the data set (if which is given), or a character vector of data set names.

Data set categories

animals archetypes architecture art colors corporations divination film-tv foods games games/bannedGames games/bannedGames/argentina games/bannedGames/brazil games/bannedGames/china games/bannedGames/denmark geography governments humans instructions materials mathematics medicine music mythology objects plants religion science societies_and_groups societies_and_groups/fraternities sports sports/football technology transportation travel words words/emoji words/literature words/stopwords words/word_clues

Data sets

animals/birds_antarctica: Birds of Antarctica, grouped by family. Source: https://en.wikipedia.org/wiki/List_of_birds_of_Antarctica
animals/birds_north_america: Birds of North America, grouped by family. Source: http://listing.aba.org/aba-checklist/
animals/cats
animals/collateral_adjectives: Collateral adjectives for animals.
animals/common
animals/dinosaurs: A list of dinosaurs.
animals/dog_names: 1000 popular dog names from the New York City Department of Health's dog licensing data. Names are roughly in order, but that may not be totally reliable.
animals/dogs: A list of dog breeds.
animals/donkeys
animals/horses
animals/ponies
archetypes/artifact: Artifact archetypes.
archetypes/character: Common character archetypes.
archetypes/event: Archetypal events.
archetypes/setting: Setting and location archetypes.
architecture/passages: Ways to enter or exit a place.
architecture/rooms: Different kinds of rooms
art/isms: A list of modernist art isms.
colors/crayola: List of Crayola crayon standard colors
colors/dulux
colors/google_material_colors
colors/paints: List of assorted paint colors from various brands.
colors/palettes: The top 200 most popular palettes on colourlovers.com
colors/web_colors: List of named HTML colors
colors/xkcd: The 954 most common RGB monitor colors, as defined by several hundred thousand participants in the xkcd color name survey.
corporations/cars: A list of car manufacturers.
corporations/djia: Corporations of the Dow Jones Industrial Average
corporations/fortune500: The 2014 Fortune 500 list
corporations/industries: A list of all industries on LinkedIn, as of May 21, 2013. Source: http://robertwdempsey.com/liindustries
corporations/nasdaq: Corporations of the NASDAQ 100
corporations/newspapers: A list of newspapers scraped in early 2013.
divination/tarot_interpretations: Tarot card interpretations, from Mark McElroy's _A Guide to Tarot Meanings_ (http://www.madebymark.com/a-guide-to-tarot-card-meanings/)
divination/zodiac: Zodiac signs and associated information, both Western and Eastern. Source:
film-tv/game-of-thrones-houses: Game of Thrones Houses
film-tv/iab_categories
film-tv/netflix-categories: Netflix Movie Categories.
film-tv/popular-movies: A bunch of movies, mostly Best Picture winners or nominees, scraped from the web.
foods/apple_cultivars: The 1000 most popular apple cultivars in the USDA's Pomological Watercolor collection.

foods/bad_beers: Beers with the 100 lowest scores on BeerAdvocate, adapted from https://www.beeradvocate.com/lists/bottom/
foods/beer_categories: A list of beer categories.
foods/beer_styles: A list of beer styles.
foods/breads_and_pastries: A list of classic breads and sweet pastries.
foods/combine: A list of recipe instructions.
foods/condiments: A list of condiments
foods/curds: A list of curds, cheeses, and other fermented dairy products
foods/fruits: A list of fruits.
foods/herbs_n_spices: A list of herbs and spices, and mixtures of the two.
foods/hot_peppers: Capsicum cultivars (hot peppers)
foods/iba_cocktails: Cocktails recognized by the International Bartenders Association for use in the World Cocktail Competition.
foods/menuItems: A list of the top 1000 most appearing menu items from the 1850s to today from the New York Public Library's "What's on the menu?" project. Please credit The New York Public Library as source on any applications or publications. http://menus.nypl.org/data
foods/pizzaToppings: A list of pizza toppings.
foods/sandwiches: A list of sandwiches.
foods/sausages: A list of sausages
foods/scotch_whiskey: A list of scotch whiskies
foods/tea: types of tea
foods/vegetable_cooking_times: Approximate cooking times for various vegetables. Source: http://recipes.howstuffworks.com/tools-
foods/vegetables: A list of vegetables.
foods/wine_descriptions: A list of words commonly used to describe wine.
games/bannedGames/argentina/bannedList: A list of video games banned in Argentina
games/bannedGames/brazil/bannedList: A list of video games banned in Brazil
games/bannedGames/china/bannedList: A list of video games banned in China.
games/bannedGames/denmark/bannedList: A list of video games banned in Denmark
games/cluedo: Characters, rooms and weapons from the board game Cluedo / Clue.
games/dark_souls_iii_messages: Organized components from the Dark Souls III message system
games/jeopardy_questions: A sampling of 1000 Jeopardy questions and metadata. For the full dataset, see http://www.reddit.com/r/datasets/comments/1uyd0t/200000_jeopardy_questions_in_a_json_file/

games/pokemon: Source: https://github.com/UberGames/iPokedex-DB
games/scrabble: Tile distribution and points for the English-language edition of Scrabble
games/street_fighter_ii: Street Fighter II fighting moves
games/trivial_pursuit: Pie categories and colors from Trivial Pursuit
games/wrestling_moves: A list of professional wrestling moves
games/zelda
geography/canada_provinces_and_territories: A list of Canadian provinces and territories.
geography/countries: A list of countries.
geography/countries_with_capitals: A list of countries and their respective capitals.
geography/english_towns_cities: Two lists: one for English towns, one for English cities.
geography/japanese_prefectures: Japanese regions and prefectures.
geography/london_underground_stations: London Underground stations, with their lines and Travelcard zones. Source: https://en.wikipedia.org/wiki/List_of_London_Underground_stations
geography/nationalities: A list of nationalities. Source: https://www.gov.uk/government/publications/nationalities/list-of-nationalities
geography/norwegian_cities: Top Norwegian Cities by 2017 population. Source: Norway Population 2017 (Demographics, Maps, Graphs), http://worldpopulationreview.com/countries/norway-population
geography/nyc_neighborhood_zips: Neighborhoods of New York City and their corresponding ZIP codes. Normal ZIP code caveats apply. Source: Compiled by United Health Fund and distributed by the New York State Department of Health: https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm
geography/oceans: A list of oceans and seas. Source: http://en.wikipedia.org/wiki/List_of_seas
geography/rivers: A list of rivers. Source: http://en.wikipedia.org/wiki/List_of_rivers_by_length
geography/sf_neighborhoods: San Francisco neighborhoods and their locations
geography/us_airport_codes: IATA and ICAO airport codes for the primary commercial airports in each state.
geography/us_cities: Top 1000 U.S. cities by population (2016 estimates). Source: US Census American Community Survey 2016 5-year Data
geography/us_counties: U.S. Counties by State. Source: https://en.wikipedia.org/wiki/List_of_counties_by_U.S._state
geography/us_metropolitan_areas: U.S. Metropolitan, Micropolitan and Combined Statistical Areas with 2016 population estimates. Source: US Census American Community Survey 2016 5-year Data
geography/us_state_capitals: U.S. State Capitals. Source: Wikipedia: List of U.S. state capitals
geography/venues: Venues organized by category. Source: https://developer.foursquare.com/categorytree
geography/winds: A list of regional and local winds and weather phenomena. Source: https://en.wikipedia.org/wiki/List_of_local_winds,
and related databases throughout the world. Source: Data found here: https://en.wikipedia.org/wiki/List_of_government_mass_surveillance_projects

governments/nsa_projects: A list of NSA project code names. Source: All data here is from
governments/uk_political_parties: A list of UK political parties. Source: http://www.electoralcommission.org.uk/ export on 8th May 2015
governments/us_federal_agencies: A list of federal agencies. Source: This data was sourced
governments/us_mil_operations: Code names for US Military Operations. Source: All names from the scraped pages of http://www.designation-systems.net/usmilav/codenames.html
humans/2016_us_presidential_candidates: All individuals who filed a Statement of Candidacy with the FEC to register as a presidential candidate in the 2016 United States election.
humans/atus_activities: Activity category codes used by the US Bureau of Labor Statistics in its American Time Use Survey. Categories either come with a set of example activities, or are standalone "miscellaneous" categories denoted "not elsewhere classified". Source: https://www.bls.gov/tus/lexicons.htm
humans/authors
humans/bodyParts: A list of common human body parts.
humans/britishActors: A bunch of British actors.
humans/celebrities: Celebrities
humans/descriptions: A list of adjectives for describing people, taken from www.enchantedlearning.com/wordlist/adjectivesforpeople.shtml
humans/englishHonorifics: English honorifics.
humans/famousDuos: Famous duos
humans/firstNames: First names of men and women, pulled from the US Census for the 2000s.
humans/lastNames: Last names of people, pulled from the US Census for the 2000s.
humans/moods: A list of words that naturally complete the phrase "They were feeling...".
humans/norwayFirstNamesBoys: First names of boys, pulled from Statistics Norway 2015. Sorted from high to low distribution.
humans/norwayFirstNamesGirls: First names of girls, pulled from Statistics Norway 2015. Sorted from high to low distribution.
humans/norwayLastNames: Last names of people, pulled from Statistics Norway 2015. Sorted from high to low distribution.
humans/occupations: A list of occupations (jobs that people might have).
humans/prefixes: Prefixes taken from a form on an airline website.


humans/richpeople: A bunch of rich people from a Forbes listicle, including the source article, img, and name
humans/scientists: List of particularly famous scientists
humans/spanishFirstNames: A list of common Spanish first names of men and women. Source: https://github.com/olea/lemarios
humans/spanishLastNames: A list of common Spanish last names. Source: https://github.com/olea/lemarios
humans/spinalTapDrummers: Deceased drummers from the fictional rock band Spinal Tap, taken from Wikipedia.
humans/suffixes: Suffixes taken from a form on an airline website.
humans/thirdPersonPronouns: Third person personal pronouns with case
humans/tolkienCharacterNames: Character names from Tolkien's Middle Earth, from https://en.wikipedia.org/wiki/List_of_Middle-earth_characters
The ID here matches the one in the corpora/data/words/us_president_quotes.json file
humans/wrestlers: A bunch of WWE wrestlers nicknames
instructions/laundry_care: A list of laundry care instructions
materials/abridged-body-fluids: abridged body fluids
materials/building-materials: building materials
materials/carbon-allotropes: carbon allotropes
materials/decorative-stones: decorative stones
materials/fabrics: fabrics
materials/fibers: fibers
materials/gemstones: A list of the names of materials commonly used as gemstones. Source:
materials/layperson-metals: layperson metals
materials/metals: metals
materials/natural-materials: natural materials
materials/packaging: packaging
materials/plastic-brands: plastic brands
materials/sculpture-materials: sculpture materials
materials/technical-fabrics: technical fabrics
mathematics/fibonnaciSequence: The first 1000 numbers in the Fibonacci Sequence
mathematics/primes: The first 1000 prime numbers.
mathematics/primes_binary: The first 1000 prime numbers in binary.
mathematics/trigonometry: A list of trigonometric functions, formulas, equations, etc.
medicine/diagnoses: International Statistical Classification of Diseases and Related Health Problems, 10th revision. Source: http://www.cdc.gov/nchs/icd/icd10cm.htm
medicine/drugNameStems: A list of generic pharmaceutical drug name stems. Hyphens indicate whether a stem appears at the beginning, middle, or end of the name. Source: http://druginfo.nlm.nih.gov/drugportal/jsp/drugportal/DrugNameGenericStems.jsp
medicine/drugs: A list of pharmaceutical drug names. Source: The United States National Library of Medicine, http://druginfo.nlm.nih.gov/drugportal/
medicine/hospitals: A partial list of the hospitals in the United States. Source: Wikipedia - List of Hospitals in the United States, https://en.wikipedia.org/wiki/Lists_of_hospitals_in_the_United_States

music/a_list_of_guitar_manufacturers: A list of guitar manufacturers. Source: https://en.wikipedia.org/wiki/List_of_guitar_manufacturers
music/bands_that_have_opened_for_tool: Bands that have opened for Tool. You must be really dedicated to your music if you are willing to play before Tool fans.
music/female_classical_guitarists: A list of women classical guitarists. Source: https://en.wikipedia.org/wiki/List_of_women_classical_guitarists
music/genres: A list of musical genres taken from Wikipedia article titles.
by them in the Original Broadway Cast recording of Hamilton: An American Musical. Actors who played multiple characters are listed multiple times. Source: https://en.wikipedia.org/wiki/Hamilton_(musical)#Principal_roles_and_major_casts
music/instruments: Musical Instruments
music/mtv_day_one: Music videos broadcast on MTV's first day. Source: https://en.wikipedia.org/wiki/First_music_videos_aired_on_MTV
music/rock_hall_of_fame: Artists who have been added to the Rock N' Roll Hall of Fame along with their year of induction. Source: https://en.wikipedia.org/wiki/List_of_Rock_and_Roll_Hall_of_Fame_inductees
music/xxl_freshman: Every rapper that's ever made the XXL Annual Freshman Cover
mythology/greek_gods: Gods and goddesses from Greek myth
mythology/greek_monsters: Monsters from Greek myth
mythology/greek_myths_master
mythology/greek_titans: Titans from Greek myth
mythology/hebrew_god: Hebrew names of God used in the Old Testament Bible
mythology/lovecraft: Deities and supernatural creatures from the works of Lovecraft and the Cthulhu mythos.
mythology/monsters: A list of monsters and other mythic creatures
mythology/norse_gods: Gods and goddesses of Norse and Germanic myth
objects/clothing: List of clothing types
objects/corpora_winners: Winners in the Corpora Brackets, from https://twitter.com/corporabrackets
objects/objects: List of household objects
plants/cannabis: 420 popular strains of cannabis
plants/flowers
plants/plants: List of plants by common name. Source: https://en.wikipedia.org/wiki/List_of_plants_by_common_name
religion/christian_saints
religion/fictional_religions
religion/parody_religions
religion/religions
science/elements
science/hail_size: Analogous objects for various hail sizes, adapted from http://www.spc.noaa.gov/misc/tables/hailsize.htm
science/minor_planets: List of names of the first 1000 numbered minor planets
science/planets: Planets (including dwarf planets as recognized by the IAU) that orbit the Sun, with their natural satellites.
science/pregnancy
science/toxic_chemicals
science/weather_conditions: A list of phrases describing weather conditions. This list includes all possible phrases that may be provided by the US National Weather Service's feeds of current weather conditions. Source: http://w1.weather.gov/xml/current_obs/weather.php


societies_and_groups/animal_welfare
societies_and_groups/semi_secret
sports/football/epl_teams: Current (as of November 2016) teams in the EPL (English Premier League) and where they play
sports/football/laliga_teams: Teams in the Spanish Primera División, La Liga (2017-18) with their details
sports/football/serieA: Teams in the Italian First División, Serie A (2017-18) with their details
sports/mlb_teams: Current (as of 2016) Major League Baseball teams and where they play
sports/nba_mvps: NBA MVP award winners 1956-2017
sports/nba_teams: Current (as of 2016) teams in the NBA and where they play
sports/nfl_teams: Current (as of 2016) teams in the NFL and where they play
sports/nhl_teams: Current (as of 2016) teams in the NHL and where they play
sports/olympics: Olympic Games with host city, host nation, olympiad number (different for winter and summer), year, start date, end date, countries participating, athletes participating, and number of events. Source: Compiled from information on Olympics.org
technology/appliances: A list of home appliances
technology/computer_sciences: names of technologies related to computer science
technology/fireworks: A list (ooh!) of firework effects (aah!)
technology/guns_n_rifles: weapons used in mass shootings in the U.S.A.
technology/knots: A list of knot names.
technology/lisp: a list of LISP dialects
technology/new_technologies: new or emerging technologies
technology/photo_sharing_websites: Photo sharing websites
technology/programming_languages
technology/social_networking_websites: Social networking websites
technology/video_hosting_websites: Video hosting websites
transportation/commercial-aircraft
travel/lcc
words/adjs: A list of English adjectives.