Arriving at a Standard Set of Categories
See DcSubject for a place to list the different categories in use by Indymedia sites.
Problem statement
How can we create a standard set of categories between IMCs spread out over 6 continents in dozens of different languages?
Necessary features:
- Multi-lingual
- Unique reference to heavily overlapping categories
- Extendable for local topics
- Can easily add new terms that we didn't think of at first
- Matches de-centralized nature of Indymedia; avoiding central repository?
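A minimal sketch of how the multi-lingual and extendable requirements could fit together, assuming each category is keyed by a language-neutral code and carries per-language labels. The names `LABELS`, `label_for` and `extend` are hypothetical, and the Spanish label and the local code 23000000 are illustrative only, not agreed translations or categories.

```python
# Hypothetical label store: code -> {language tag -> label}.
LABELS = {
    "05000000": {"en": "education", "es": "educación"},
    "07000000": {"en": "health"},
}


def label_for(code, lang, labels=LABELS, fallback="en"):
    """Return the label for `code` in `lang`, falling back to English,
    and finally to the bare code if nothing is known about it."""
    names = labels.get(code)
    if names is None:
        return code
    return names.get(lang) or names.get(fallback) or code


def extend(base, local):
    """Return a new label store with a site's local categories and
    translations merged in. Neither input is modified."""
    merged = {code: dict(names) for code, names in base.items()}
    for code, names in local.items():
        merged.setdefault(code, {}).update(names)
    return merged


# A site adds a local topic and a missing translation (illustrative values).
site_labels = extend(LABELS, {
    "23000000": {"en": "local housing campaign"},
    "07000000": {"es": "salud"},
})
print(label_for("07000000", "es", site_labels))
```

Because the code, not the label, is the identifier, two sites can show different words for the same category and still exchange articles without ambiguity.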
Similar Problems
- Cataloging reference materials
- Finding file types (the MIME system)
Possible Approaches
- Use external source
- Generate exhaustive list and combine duplicated subjects
- Don't define exhaustive list -- just a way to reference which IMC defined the category (e.g., if UK defines a category for environmentalism, it would be uk.indymedia.org / Environmentalism)
- Allow multiple definitions of the same subjects, using different controlled vocabularies and the dc:subject scheme to separate them. This website gives a good example of this approach.
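The last two approaches both amount to qualifying each subject with the vocabulary that defined it. The sketch below shows one way to represent that, assuming references are written as "scheme / term" like the uk.indymedia.org example above. `SubjectRef`, `parse_subject` and `group_by_scheme` are hypothetical names, and "iptc" is an assumed scheme label rather than an established identifier.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectRef:
    scheme: str  # who defined the term, e.g. an IMC hostname
    term: str    # the term within that vocabulary

    def __str__(self):
        return f"{self.scheme}/{self.term}"


def parse_subject(ref):
    """Split 'scheme / term' at the first slash, ignoring surrounding spaces."""
    scheme, sep, term = ref.partition("/")
    scheme, term = scheme.strip(), term.strip()
    if not sep or not scheme or not term:
        raise ValueError(f"expected 'scheme / term', got {ref!r}")
    return SubjectRef(scheme, term)


def group_by_scheme(refs):
    """Group an article's subjects by the vocabulary they come from."""
    grouped = defaultdict(list)
    for ref in refs:
        subject = parse_subject(ref)
        grouped[subject.scheme].append(subject.term)
    return dict(grouped)


# One article tagged from two vocabularies at once.
article_subjects = [
    "uk.indymedia.org / Environmentalism",
    "iptc / 06000000",
]
print(group_by_scheme(article_subjects))
```

Nothing here requires a central repository: a site only needs to recognise the schemes it cares about and can ignore the rest.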
Starting Points
Examples
IPTC classifications for news articles
The broad categories only; each has many, many subdivisions.
| 01000000 | Arts, culture, and entertainment |
| 02000000 | crime, law and justice |
| 03000000 | disaster and accident |
| 04000000 | economy, business and finance |
| 05000000 | education |
| 06000000 | environmental issue |
| 07000000 | health |
| 08000000 | human interest |
| 09000000 | labour |
| 10000000 | lifestyle and leisure |
| 11000000 | politics |
| 12000000 | religion and belief |
| 13000000 | science and technology |
| 14000000 | social issue |
| 15000000 | sport |
| 16000000 | unrest, conflicts and war (incl. protest and social movements) |
| 17000000 | weather (incl. climate change) |
Example: LOC Classification Numbers
(without sub-headings)
| A | GENERAL WORKS |
| B | PHILOSOPHY. PSYCHOLOGY. RELIGION |
| C | AUXILIARY SCIENCES OF HISTORY |
| D | HISTORY (GENERAL) AND HISTORY OF EUROPE |
| E | HISTORY: AMERICA |
| F | HISTORY: AMERICA |
| G | GEOGRAPHY. ANTHROPOLOGY. RECREATION |
| H | SOCIAL SCIENCES |
| J | POLITICAL SCIENCE |
| K | LAW |
| L | EDUCATION |
| M | MUSIC AND BOOKS ON MUSIC |
| N | FINE ARTS |
| P | LANGUAGE AND LITERATURE |
| Q | SCIENCE |
| R | MEDICINE |
| S | AGRICULTURE |
| T | TECHNOLOGY |
| U | MILITARY SCIENCE |
| V | NAVAL SCIENCE |
| Z | BIBLIOGRAPHY. LIBRARY SCIENCE. INFORMATION RESOURCES (GENERAL) |
Dewey Decimal
| 000 | Generalities |
| 100 | Philosophy & psychology |
| 200 | Religion |
| 300 | Social sciences |
| 400 | Language |
| 500 | Natural sciences & mathematics |
| 600 | Technology (Applied sciences) |
| 700 | The arts |
| 800 | Literature & rhetoric |
| 900 | Geography & history |
Universal Decimal Classification
| 0 | GENERALITIES |
| 1 | PHILOSOPHY. PSYCHOLOGY |
| 2 | RELIGION. THEOLOGY |
| 3 | SOCIAL SCIENCES |
| 4 | VACANT |
| 5 | NATURAL SCIENCES |
| 6 | TECHNOLOGY |
| 7 | THE ARTS |
| 8 | LANGUAGE. LINGUISTICS. LITERATURE |
| 9 | GEOGRAPHY. BIOGRAPHY. HISTORY |
First pass at an IMC Controlled Vocabulary
This approach uses the IPTC newscodes as a starting point. It moves protest (a subset of the 16s) into its own category and adds a few other new categories, including one for Indymedia. Categories marked with a ? are less likely to be used.
General approach: up to 99 broad categories, 01-99, each with sub-categories. E.g., broad category: 01000000. Subcategory: 01100000. There is no need for a "synthetic" classification approach, since each article/feature can have multiple associated categories.
Goal: make a sensible hierarchy, without leaving out any important IMC categories. Secondary goal: make at least part of our hierarchy compatible with the IPTC newscodes.
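The numbering scheme above can be sketched as a small helper that finds the parent of any code. It assumes the 8 digits split into three levels of 2, 3 and 3 digits, which is how the IPTC subject codes are laid out and which fits the examples on this page (01100000 under 01000000, 14003002 under 14003000). The page itself only specifies the two-digit broad category, so the lower levels are an assumption, and the function names are hypothetical.

```python
def split_code(code):
    """Split an 8-digit code into (broad, middle, detail) parts."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected 8 digits, got {code!r}")
    return code[:2], code[2:5], code[5:]


def parent(code):
    """Return the code one level up, or None for a broad category."""
    broad, middle, detail = split_code(code)
    if detail != "000":
        return broad + middle + "000"
    if middle != "000":
        return broad + "000000"
    return None


def ancestors(code):
    """Return all enclosing categories, nearest first."""
    chain = []
    current = parent(code)
    while current is not None:
        chain.append(current)
        current = parent(current)
    return chain


# "immigration" sits under "demographics", which sits under "social issues".
print(ancestors("14003002"))
```

With a helper like this, an article tagged only with a detailed code can still be listed under its broad category, so sites that use only the top level remain compatible with sites that use finer ones.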
| Code | English |
| 01000000 | Arts, culture, and entertainment |
| 02000000 | crime, law and justice |
| 03000000 | disaster and accident |
| 04000000 | Capitalism, Corporations, economy and Business |
| 04100000 | globalization, e.g. |
| 05000000 | education |
| 06000000 | environmental issue |
| 07000000 | health |
| 08000000 | human interest? |
| 09000000 | labor |
| 10000000 | lifestyle and leisure? |
| 11000000 | politics and governments |
| 12000000 | religion and belief |
| 13000000 | science and technology |
| 14000000 | social issues - this category contains homelessness, racism, gender, LGBTIQ issues, etc. |
| 14003000 | demographics, e.g. |
| 14003002 | immigration, e.g. |
| 15000000 | sport? |
| 16000000 | unrest, conflicts and war - incl. political dissent, protest, etc. |
| 17000000 | weather (incl. climate change) |
| 18000000 | Social Movements |
| 18100000 | Zapatista, e.g. |
| 19000000 | Indymedia |
| 19100000 | Indymedia global, e.g. |
| 20000000 | Repression? |
| 20100000 | State Authority, e.g. |
| 20200000 | Police Authority, e.g. |
| 22000000 | Analysis/Commentary/Tactical Discussion |