Session 1 : Introduction to XML
Objectives :
Discuss XML
List and explain advantages of XML
Explain XML document structure
Explain XML scope and application
Explain well-formed and valid XML documents
Explain how to use a DTD to generate an XML document
Explain Namespaces
Introduction 2-1 :
XML, or Extensible Markup Language, lets users define their own set of tags.
It makes it possible for people or programs to understand these tags.
Hierarchy of markup languages: SGML, from which both HTML and XML are derived.
XML:
Is a meta-language
Is a text-based format that lets developers describe, deliver, and exchange structured data between a range of applications
Facilitates the transfer of structured data between servers
Allows identification, exchange, and processing of data understood by databases using custom formats
Introduction 2-2 :
Demonstration: Example 1
CHINA GARDEN
3336767
25th St.
Beijing
China
20056
chinagarden@china.com
. . .
XML declaration: defines the XML version
XML tags: contain the data
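The markup of Example 1 is not reproduced above, so the following is only a minimal sketch of what such a record could look like, read with Python's standard xml.etree.ElementTree module. The element names (restaurant, name, phone, address, street, city, country, zip, email) are assumptions made for illustration; only the data values come from the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the element names are assumed, the values are from Example 1.
RESTAURANT_XML = """<?xml version="1.0"?>
<restaurant>
  <name>CHINA GARDEN</name>
  <phone>3336767</phone>
  <address>
    <street>25th St.</street>
    <city>Beijing</city>
    <country>China</country>
    <zip>20056</zip>
  </address>
  <email>chinagarden@china.com</email>
</restaurant>
"""

# The XML declaration on the first line states the XML version;
# the tags that follow carry the data.
root = ET.fromstring(RESTAURANT_XML)
record = {
    "name": root.findtext("name"),
    "city": root.findtext("address/city"),
    "email": root.findtext("email"),
}
print(record)
```

Because the tags describe the data, a program can pick out individual fields by name instead of relying on their position in the text.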
Benefits of XML :
Figure: Three-tier architecture. Desktop tier: HTML View #1 and HTML View #2 (multiple views created from XML-based data). Middle tier: web server with DB access, integration, and business rules. Storage tier: database. XML is exchanged over HTTP.
Benefits of XML are grouped into technological benefits, business benefits, and other benefits:
Information sharing
Single application usage
Content delivery
Re-use of data
Separation of data and presentation
Extensibility
Semantic information
Demonstration: Benefits of XML
XML Scope and Application :
XML is valuable both to the Internet and to large corporate intranet environments.
It provides interoperability using a flexible, open, standards-based format.
Applications can be built more quickly.
Applications are easier to maintain.
Multiple views of the structured data can easily be provided using different style sheets.
Examples for the use of XML:
SABRE
Chemical Markup Language
XML Document structures :
An XML document is composed of sets of "entities" identified by unique names.
Every document begins with a "root" or document entity.
]>
Tom
How are you
Parts labeled in the example: XML declaration, Document Type Definition, entity definition, document element
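The four labeled parts can be sketched as follows. This is an illustration under assumptions, not the slide's original markup: the element names note, to, and body and the entity name greeting are hypothetical, and only the text "Tom" and "How are you" comes from the example.

```python
import xml.etree.ElementTree as ET

NOTE_XML = (
    '<?xml version="1.0"?>\n'                # XML declaration
    '<!DOCTYPE note [\n'                     # Document Type Definition starts
    '  <!ENTITY greeting "How are you">\n'   # entity definition (hypothetical name)
    ']>\n'                                   # Document Type Definition ends
    '<note><to>Tom</to><body>&greeting;</body></note>\n'  # document element
)

root = ET.fromstring(NOTE_XML)
# The parser replaces the entity reference &greeting; with its declared text.
print(root.tag, "|", root.findtext("to"), "|", root.findtext("body"))
```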
Creating an XML Document :
Steps involved in building an XML document:
State an XML declaration
Create a root element
Create the XML code
Demonstration: Creating an XML Document
Points to observe while creating the XML code:
At least one element is required
XML tags are case sensitive
Tags should be ended correctly
Tags should be nested properly
Legal tag names should be used
The length of markup names should be considered
Valid attributes should be defined
The document should be verified
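Several of these points can be checked mechanically. The sketch below assumes Python's standard xml.etree.ElementTree parser as the verifier; the helper is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True when the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Each sample breaks (or obeys) one of the rules listed above.
checks = {
    "single root element": "<note>Hello</note>",
    "no element at all": "just text",
    "case mismatch in tags": "<Note>Hello</note>",
    "tag not ended": "<note>Hello",
    "improper nesting": "<a><b>text</a></b>",
    "illegal tag name": "<1note>Hello</1note>",
}

for rule, sample in checks.items():
    print(rule, "->", is_well_formed(sample))
```

Only the first sample is accepted; every other sample is rejected by the parser.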
Data Versus Markup :
An XML document consists of data and the markup that describes the data.
In the example, the tags around "Jackie Chan" are the markup, and the text "Jackie Chan" itself is the character data.
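Comments :
The syntax for a comment: <!-- comment text -->
Rules followed while using comments:
The string "--" should not be included within the text of the comment
A comment should never be placed within a tag
A comment should never be placed inside an entity declaration or before the XML declaration
Comments can be used to comment out tag sets
Comments cannot be nested
Processing Instruction
The syntax for a processing instruction: <?name-of-application instruction-information?>
Name of the application: identifies the application the instruction is meant for
Instruction information: the information passed to that application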
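The comment rules and the two parts of a processing instruction can be tried out with Python's standard parsers. This is a sketch under assumptions: the note element and the style.xsl file name are hypothetical, while xml-stylesheet is a commonly used processing-instruction target.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def parses(text):
    """Return True when the text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


ok_comment = '<?xml version="1.0"?><!-- a valid comment --><note>Hi</note>'
double_hyphen = "<note><!-- not -- allowed -->Hi</note>"
inside_tag = "<note <!-- misplaced --> >Hi</note>"
before_declaration = '<!-- too early --><?xml version="1.0"?><note>Hi</note>'
nested = "<note><!-- outer <!-- inner --> still outer -->Hi</note>"

for label, sample in [
    ("valid comment", ok_comment),
    ("double hyphen in text", double_hyphen),
    ("comment inside a tag", inside_tag),
    ("comment before declaration", before_declaration),
    ("nested comment", nested),
]:
    print(label, "->", parses(sample))

# A processing instruction: application name (target) plus instruction information.
PI_DOC = '<?xml-stylesheet type="text/xsl" href="style.xsl"?><note>Hi</note>'
pi = minidom.parseString(PI_DOC).firstChild
print(pi.target, "|", pi.data)
```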
Classification of the Character Data between Tags :
The text between the start and end tags is defined as "character data".
Character data may be any legal (Unicode) character, with the exception of "<" and "&", which must be written as entity references.
Classification of character data:
PCDATA: will be parsed by a parser
CDATA: will not be parsed by a parser
JACKIE CHAN
jackie@usa.com
The character string "]]>" is not allowed within a CDATA block, as it will signal the end of the CDATA block.
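The difference between parsed and unparsed character data can be seen with a short sketch. The element names name and email are assumptions made for illustration, and the text content is adapted from the example values above.

```python
import xml.etree.ElementTree as ET

# PCDATA is parsed: the entity reference &amp; is replaced by "&".
pcdata = ET.fromstring("<name>JACKIE &amp; CHAN</name>")

# CDATA is not parsed: "<" and "&" are taken literally inside the block.
cdata = ET.fromstring("<email><![CDATA[<jackie@usa.com> & friends]]></email>")

print(pcdata.text)
print(cdata.text)

# The same text outside a CDATA block is rejected, because "<" starts markup.
try:
    ET.fromstring("<email><jackie@usa.com></email>")
    unescaped_ok = True
except ET.ParseError:
    unescaped_ok = False
print("unescaped '<' accepted:", unescaped_ok)
```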
Entities 2-1 :
Entities are the storage units of XML.
They are used within a document to avoid typing long pieces of text repeatedly.
Some of the predefined entities representing characters: &lt; (<), &gt; (>), &amp; (&), &quot; ("), &apos; (')
Categories of entities: general entities and parameter entities
General entities: entities that can appear anywhere in an XML document
Internal general entities exist in the document where they were declared
External general entities refer to a storage unit outside the document
An external entity uses a SYSTEM or PUBLIC identifier to point at the storage unit outside the document
Example for general entity:
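As an illustration of predefined entities and an internal general entity, here is a minimal sketch using Python's standard library. The memo document, its element names, and the entity name company with its replacement text are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The five predefined entities stand for characters that have a meaning in markup.
predefined = ET.fromstring("<t>&lt; &gt; &amp; &quot; &apos;</t>")
print(predefined.text)

# escape() goes the other way and writes "&", "<", and ">" as entity references.
print(escape("5 < 6 & 7 > 3"))

# An internal general entity: declared once in the DTD, referenced twice in the body.
MEMO_XML = (
    '<!DOCTYPE memo [\n'
    '  <!ENTITY company "China Garden Restaurants Ltd.">\n'
    ']>\n'
    '<memo><from>&company;</from><footer>Copyright &company;</footer></memo>'
)
memo = ET.fromstring(MEMO_XML)
print(memo.findtext("from"))
print(memo.findtext("footer"))
```

Changing the entity declaration changes every place that refers to it, which is what makes entities useful for long, repeated text.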
Entities 2-2 :
A parameter entity is used when entities and entity references are required to appear only in the DTD.
Parameter entities, either internal or external, are only used within the DTD.
Both the declaration of a parameter entity and the reference to it use the "%" specifier.
Example for parameter entity:
The DOCTYPE Declarations :
...body of the document....
DOCTYPE declaration
Demonstration: Example 2
]>
JACKIE
&FIRSTFLOOR;
5715746
ARNOLD
&SECONDFLOOR;
6865863
Parts labeled in the example: XML declaration, DOCTYPE declaration, general entity, root node, details of the customer node
Well formed and Valid XML document :
An XML document is considered well formed if it satisfies a minimum set of requirements.
A fatal error occurs if any of the requirements for well-formedness is not met by the document.
A valid XML document is a well-formed XML document that conforms to the rules of a Document Type Definition (DTD).
Figure: an editor holds the XML document, the parser parses it, and the parsed document is viewed in the browser.
Types of parsers:
Non-validating parser: checks the well-formedness of the document.
Validating parser: checks the validity of the document using the DTD.
Using Document type definition (DTD) to generate XML document :
A DTD comes in the form of a simple text file, which can be stored in a separate file or embedded within the XML file.
XML documents referencing a DTD will contain the <!DOCTYPE> declaration.
Why use a DTD?
It verifies that the data received is valid
It can be used to verify one's own data
It defines the legal building blocks of an XML document
It defines the document structure with a list of legal elements
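Structure of DTD 2-1 :
In XML, an element is a logical component of the document.
An attribute represents a characteristic of an element.
General structure of a DTD: <!DOCTYPE root-element [ element and attribute declarations ]>
Declaring an element: <!ELEMENT element-name content-model>
Declaring an empty element: <!ELEMENT element-name EMPTY>
Element with data: <!ELEMENT element-name (#PCDATA)> or <!ELEMENT element-name ANY>
Element with child element: <!ELEMENT element-name (child-name)>
Declaring a minimum of one occurrence of the same element: <!ELEMENT element-name (child-name+)>
Declaring zero or more occurrences of the same element: <!ELEMENT element-name (child-name*)>
Declaring zero or one occurrence of the same element: <!ELEMENT element-name (child-name?)>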
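The meaning of the occurrence indicators "+", "*", and "?" can be illustrated with a toy matcher. This is a sketch only, not a DTD validator: it handles a single flat, comma-separated sequence, and the helper names model_to_regex and children_match as well as the sample details element are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model):
    """Translate a flat sequence such as "(name, author+, price?)" into a regex.

    Each child name becomes one group; its "+", "*", or "?" suffix is kept,
    because the regex quantifiers have the same meaning as in a DTD.
    """
    parts = [p.strip() for p in model.strip("() ").split(",")]
    pattern = ""
    for part in parts:
        suffix = part[-1] if part[-1] in "+*?" else ""
        name = part.rstrip("+*?")
        pattern += "(?:%s )%s" % (re.escape(name), suffix)
    return re.compile(pattern)


def children_match(element, model):
    """Check the element's child tags, in order, against the content model."""
    names = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(names) is not None


MODEL = "(name, author+, publication?, price)"
good = ET.fromstring(
    "<details><name>n</name><author>a</author><author>b</author>"
    "<price>50</price></details>"
)
bad = ET.fromstring("<details><name>n</name><price>50</price></details>")

print(children_match(good, MODEL))  # two authors, optional publication omitted
print(children_match(bad, MODEL))   # author+ requires at least one author
```

A real validating parser performs this kind of check for every element, including nested groups and choices, which this sketch deliberately leaves out.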
Structure of DTD 2-2 :
Declaring mixed content: <!ELEMENT element-name (#PCDATA | child-name)*>
Groups can be:
A sequence of sub-elements and/or subgroups
A choice of sub-elements and/or subgroups
Demonstration: Example 3
xml complete
jackie from &country;
Mac graw &rights;
&pricenotation;50
xml unleashed
Raghu from &count;
Mac graw &rights;
&pricenotation;45
Parts labeled in the example:
Declaration of the DTD
The element book has a child element details
The element details has the child elements name, author, publication, and price
Various entities are declared in the XML code
Attribute Declaration :
The element 'rectangle' is defined as an empty element with an attribute width of type CDATA and a default value of 0.
Kinds of attribute values:
Default attribute value
Implied attribute value
Required attribute value
Fixed attribute value
Enumerated attribute value
ID and IDREF attribute types
. . .
The Topic is XML
The attribute Topicid gives the ID of the element Topic
...
The attributes Prev and Next point to the ID of another element
IDREFS attribute type
. . .
. . .
This attribute takes multiple element IDs as its value
ENTITY and ENTITIES
These attributes point to external data in the form of unparsed entities
NMTOKEN, NMTOKENS
. . .
. . .
Used to specify any valid XML name token or tokens
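The rectangle declaration described at the start of this slide can be sketched as follows. The shapes root element and the second rectangle with width 25 are assumptions made for illustration. Python's xml.etree.ElementTree uses a non-validating parser: it reads the internal DTD and supplies declared default values, but it does not enforce the declarations.

```python
import xml.etree.ElementTree as ET

SHAPES_XML = (
    '<!DOCTYPE shapes [\n'
    '  <!ELEMENT shapes (rectangle*)>\n'
    '  <!ELEMENT rectangle EMPTY>\n'             # empty element
    '  <!ATTLIST rectangle width CDATA "0">\n'   # attribute of type CDATA, default 0
    ']>\n'
    '<shapes><rectangle/><rectangle width="25"/></shapes>'
)

shapes = ET.fromstring(SHAPES_XML)
# The first rectangle omits width, so the declared default is reported for it.
widths = [rectangle.get("width") for rectangle in shapes]
print(widths)
```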
DTD Example 2-1 :
A DTD can be internal or external.
Internal: written directly in the XML document after the XML declaration
External: exists outside the content of a document
Demonstration: Example 4
]
….
Internal Document Type Definitions
DTD Example 2-2 :
Demonstration: Example 5
Con Air
Nicolas Cage
PG
Ghosts
Demi Moore
Patrick Swayze
External Document Type Definitions
External DTD file
Entity Declaration in DTD :
The contents of internal entities occur within the XML document.
External entities are those whose contents are outside the XML document.
The SYSTEM keyword is used to declare an entity that is external to the document.
XML Namespaces 2-1 :
A namespace is a collection of names that can be used in XML documents as element or attribute names.
Namespaces allow the browser to:
Combine documents from different sources, and help to identify the source of elements or attributes.
Access DTDs or other description of the elements and attributes against which, the document is validated.
A Uniform Resource Identifier (URI) identifies a namespace in XML.
Uniform Resource Name (URN): a universally unique name that identifies Internet resources
Uniform Resource Locator (URL): contains the reference for a document or an HTML page on the Web
Need for a namespace:
Namespaces help standardize and uniquely brand elements and attributes.
They employ the URI to instruct the user-agent about the location of the DTD against which the XML document is checked for validity.
They ensure that there is no conflict between element names, and clarify their origins.
XML Namespaces 2-2 :
Demonstration: Example 6
Declaration of namespace
Attributes and Namespaces :
Unprefixed attributes are interpreted in the context of their element and do not take the default namespace; a prefixed attribute belongs to the namespace bound to its prefix.
. . .
Evening Batch
Morning Batch
Afternoon Batch
. . . . . .
xmlns="http://www.Aptech_edu.ac"
xmlns:tea_batch="http://www.tea.org">
Evening Batch
Tea batch III
Afternoon Batch
. . .
The class element of Aptech "inherits" the tea_batch:type attribute from the 'tea industry' domain.
Two attributes with the same name can also be included, provided they belong to different namespaces.
Tea Batch I
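How a namespace-aware parser reports such elements and attributes can be sketched with Python's xml.etree.ElementTree, which writes each qualified name as {namespace-URI}local-name. The two namespace URIs, the class element, and the tea_batch:type attribute come from the slide; the batches root element and the attribute values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

BATCHES_XML = (
    '<batches xmlns="http://www.Aptech_edu.ac"\n'
    '         xmlns:tea_batch="http://www.tea.org">\n'
    '  <class type="evening" tea_batch:type="III">Evening Batch</class>\n'
    '</batches>'
)

root = ET.fromstring(BATCHES_XML)
cls = root[0]

# Elements without a prefix take the default namespace.
print(cls.tag)

# The unprefixed attribute has no namespace; the prefixed one carries the tea URI,
# so two attributes named "type" can sit side by side.
print(cls.attrib)

# Look up the element through a prefix-to-URI mapping chosen by the caller.
found = root.find("apt:class", {"apt": "http://www.Aptech_edu.ac"})
print(found.text)
```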
Namespace Application :
Demonstration: Example 7
Evening Batch
Morning Batch
Afternoon Batch
Tea batch I333 Batch
Tea batch II222 Batch
Declaration of namespace
Tells the user that the batch element is defined somewhere as http://www.Aptech_edu.ac.
Allows validating and processing the information about the two batches of tea
Summary 2-1 :
XML is extensible, which means that we can define our own set of tags, and make it possible for other parties (people or programs) to know and understand these tags. This makes XML much more flexible than HTML.
A well-formed document is one that conforms to the basic rules of XML.
The DTD specifies the grammatical structure of an XML document, thereby allowing XML parsers to understand and interpret the document's contents.
The EMPTY element-content type specifies that the element has no child elements or character data.
NMTOKEN and NMTOKENS are used to specify any valid XML name token or tokens. They may be used when we are associating another component with the element, such as a Java class or a security algorithm.
Summary 2-2 :
A DTD can be either:
External DTD
Internal DTD
Entities allow us to create an alias to some large piece of text, so that, in the document, we can refer to the same piece of text, simply by referring to the alias.
Namespaces allow us to combine documents from different sources, and be able to identify which elements or attributes come from which source.