
initial technical memos submitted

Jiri Kuthan authored on 08/01/2003 14:01:52
Showing 4 changed files
@@ -1,2 +1,19 @@
1
-This directory contains short memos documenting design decisions
2
-made in ser or accompanying applications.
1
+This directory contains short technical memos documenting 
2
+technical decisions made or planned to be made in ser or 
3
+accompanying applications. The memos serve as requests
4
+for comments in the literal sense (not to be confused
5
+with IETF's RFCs).
6
+
7
+The documents here are drafts, for whose technical maturity
8
+no guarantee can be provided.  They may advocate non-workable
9
+design ideas, frequently change, or be replaced by better
10
+technology suggestions. They may or may not be implemented.
11
+
12
+The memo texts follow IETF traditions: they are encoded
+in plain ASCII. Their filenames consist of
14
+ - tmemo (=technical memo -- to avoid confusion with IETF prefixes)
15
+ - author id
16
+ - text id
17
+For example: tmemo-johndoe-backtobackua.txt
18
+No version numbers are used in filenames -- they are displayed
+in the text and assigned by the CVS server.
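The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the repository; the helper name and the assumption that author and text ids consist only of lowercase letters and digits are ours.

```python
import re

# Assumed character set for author and text ids: lowercase letters and digits.
TMEMO_RE = re.compile(r"^tmemo-(?P<author>[a-z0-9]+)-(?P<text>[a-z0-9]+)\.txt$")


def parse_tmemo_name(filename):
    """Return (author_id, text_id), or None if the name breaks the convention."""
    m = TMEMO_RE.match(filename)
    return (m.group("author"), m.group("text")) if m else None


print(parse_tmemo_name("tmemo-johndoe-backtobackua.txt"))
```

A name that carries a version suffix or a different prefix is rejected, which matches the rule that versions live in CVS rather than in the filename.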
3 20
new file mode 100644
@@ -0,0 +1,259 @@
1
+$Id$
2
+
3
+Building Prepaid Scenarios Using SIP/SER
4
+========================================
5
+
6
+Jiri Kuthan, iptel.org, January 2003
7
+
8
+Abstract
9
+--------
10
+Prepaid scenarios for making calls to PSTN gateways require the 
11
+ability to terminate an existing call when the user's credit is
12
+exhausted. Though it seems appropriate to implement such 
13
+a feature in the device providing the service (i.e., in the gateway),
14
+we are currently not aware of such gateways. We thus first
15
+recommend a session-timer based approach which possibly works,
16
+and requires limited support in end-devices (session-timer)
17
+and proxy servers (session-timer and call length determination).
18
+We then discuss another alternative, based on a B2BUA middlebox,
19
+which works even with the dumbest PSTN gateways but puts
20
+a considerable workload on SIP server.
21
+
22
+TOC
23
+---
24
+Section 1 explains design alternatives which can be made when
25
+designing a "forced" call termination (FCT). The design alternatives
+are FCT support in end-devices (nice, but not doable with current
+gateways), FCT support using session-timer (nice, hopefully doable,
28
+requires session-timer support) and FCT using a B2BUA (ugly
29
+and costly, but best backwards-compatible).
30
+
31
+Section 2 details known drawbacks of the B2BUA technology.
32
+
33
+Preliminary hints how to implement the B2BUA using ser,
34
+which has no B2BUA support, are detailed in section 3.
35
+
36
+1. How To Terminate a Call When No Money Is Left
37
+--------------------------------------------------
38
+
39
+In general, there are many ways to implement a service operator
40
+driven call cut-off. We argue that the architecturally most acceptable
41
+place for this functionality is in the terminating PSTN end-device.
42
+The device already keeps session state, it also knows when things
+go wrong on the PSTN side, and it is able to detect the caller's media
+inactivity -- it is simply fully in control of the call. Thus it
45
+seems an ideal place for implementing a call termination functionality.
46
+No other element in the network knows all the things the end-device 
47
+knows. 
48
+
49
+The missing piece is then the ability to determine maximum call
50
+duration. A consequent application of the approach of placing the
51
+logic in end-device would make the gateway query some database.
52
+(It is of limited use to include this information directly in
53
+the gateway, as multiple devices may want to share this piece of
54
+information.) However support of such a "query-credit" protocol
55
+does not exist in PSTN gateways. Other solutions are thus sought.
56
+
57
+One way to make the gateway aware of the maximum call duration is
58
+to determine it in a proxy server (which typically has programming
59
+capabilities that allow doing so) and propagating it to gateways 
60
+using SIP session timer. 
61
+  http://www.iptel.org/ietf/callsignalling/#draft-ietf-sip-session-timer
62
+This solution is scalable in that the element determining the maximum
63
+call length is a proxy server, which is at most transactionally
64
+stateful. No call state needs to be maintained except in the
65
+end-devices. 
66
+
67
+The behaviour of the session-timer-based construct is as follows:
68
+a caller initiates a call through a proxy server. The proxy server
+determines the maximum acceptable call length and indicates it using
+the session timer mechanisms. The timer is then propagated to
71
+the end-device using SIP. If it actually hits, the terminating 
72
+gateway will try to revitalize the session using a re-INVITE. 
73
+The proxy server then can recalculate available credit, and if too 
74
+low, deny the re-invitation. The end-device is then supposed 
75
+to terminate the call using a BYE.
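The proxy-side decision described above can be sketched as follows. This is an illustration only and has not been tried against any gateway; the account model, the tariff, the MIN_USEFUL_SECONDS floor, the optional cap and all function names are assumptions, not ser features.

```python
import math
from dataclasses import dataclass

MIN_USEFUL_SECONDS = 30  # assumed: below this, a call is not worth admitting


@dataclass
class Account:
    balance_cents: int       # remaining prepaid credit
    rate_cents_per_min: int  # assumed tariff for the dialed destination


def affordable_seconds(acct):
    """Longest call the remaining credit pays for, in whole seconds."""
    return (acct.balance_cents * 60) // acct.rate_cents_per_min


def on_invite(acct, cap=None):
    """Handle an initial INVITE or a session-refreshing re-INVITE.

    Returns ("forward", session_timer_seconds) or ("reject", None).
    """
    seconds = affordable_seconds(acct)
    if seconds < MIN_USEFUL_SECONDS:
        return ("reject", None)       # credit too low: deny the (re-)invitation
    if cap is not None:
        seconds = min(seconds, cap)   # optional shorter refresh interval
    return ("forward", seconds)


def charge(acct, elapsed_seconds):
    """Deduct the cost of the interval that just ended, rounding up."""
    cost = math.ceil(elapsed_seconds * acct.rate_cents_per_min / 60)
    acct.balance_cents = max(0, acct.balance_cents - cost)


acct = Account(balance_cents=500, rate_cents_per_min=100)
decision, expires = on_invite(acct, cap=120)      # initial INVITE
while decision == "forward":
    charge(acct, expires)                         # timer hit: bill the interval
    decision, expires = on_invite(acct, cap=120)  # gateway's re-INVITE
print(decision, acct.balance_cents)
```

Note that the proxy keeps no call state here: each re-INVITE is judged only on the current balance, which is what keeps the approach transaction-stateful at most.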
76
+
77
+We have never experimented with the session-timer-based solution.
78
+We do not know if some session timer negotiation troubles can
79
+occur. We do not know how widely support of session timer is
80
+deployed in gateways. We do not know whether the standardization
81
+effort for session-timer will result in some changes and when
82
+it will complete. Nevertheless, we think it is worth trying.
83
+Its appeal is that it leaves call termination, a call-stateful
+feature, down in the end-devices and does not place too big
+a burden on server developers and especially operators.
86
+
87
+WE THUS ENCOURAGE VOLUNTEERS TO EXPERIMENT WITH THIS OPTION.
88
+TAKE THE GATEWAY YOU HAVE, LOOK AT IT IF IT SUPPORTS ST,
89
+ADD ST TO SER PROXY AND CHECK IF THINGS WORK.
90
+
91
+
92
+
93
+2. What Are B2BUA limitations?
94
+------------------------------
95
+A B2BUA has all the drawbacks of a centralized solution. While
+B2BUAs are applicable in the prepaid scenarios, one should not
+forget the price.
98
+
99
+a) it is a single point of failure. When in the middle of
100
+   a conversation additional signaling occurs and the B2BUA
101
+   is down, signaling will fail. (Doesn't happen if signaling
102
+   runs only between end-points.) Call persistency must be
103
+   implemented, signaling will otherwise fail on server
104
+   reboot.
105
+b) scalability issues: a B2BUA needs to keep state for two
106
+   calls for the whole duration of a conversation. That might
107
+   be an issue with too heavy traffic. Transaction state
108
+   takes 3k per transaction and lasts seconds. Call state
109
+   consumes at least twice as much and lasts minutes.
110
+c) e2e security does not work -- implementations willing to
111
+   achieve high security will want to encrypt and sign
112
+   SIP message bodies. B2BUA breaks the e2e security if
113
+   it needs to change the body.
114
+d) economical aspects: it is simply yet another piece of
115
+   software you need to purchase or develop
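To put the scalability point (b) into numbers, the memory held at any moment is roughly arrival rate times state lifetime times state size. The sketch below works this out; the call rate, the 5-second transaction lifetime and the 3-minute call length are made-up inputs, and "3k" is read as 3 KB.

```python
KB = 1024


def state_memory_bytes(arrivals_per_sec, lifetime_sec, bytes_per_item):
    """Items held at once = arrival rate * lifetime; multiply by item size."""
    return arrivals_per_sec * lifetime_sec * bytes_per_item


calls_per_sec = 100  # assumed load, for illustration only

# Transaction state: 3 KB each, assumed to live about 5 seconds.
txn_bytes = state_memory_bytes(calls_per_sec, 5, 3 * KB)

# Call state: at least twice as much (6 KB), assumed 3-minute calls.
call_bytes = state_memory_bytes(calls_per_sec, 180, 6 * KB)

print(f"transaction state: {txn_bytes / KB**2:.1f} MiB")
print(f"call state:        {call_bytes / KB**2:.1f} MiB")
print(f"ratio:             {call_bytes / txn_bytes:.0f}x")
```

With these inputs call state costs 72 times the memory of transaction state (36 times the lifetime, twice the size); the ratio, not the absolute figures, is the point.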
116
+
117
+Much of this discussion has taken place on the IETF SIP
+and SIPPING mailing lists. A few messages from these
+discussions are linked from
120
+   http://www.iptel.org/info/trends/#b2bua
121
+
122
+3. How to Implement a B2BUA Using ser
123
+-------------------------------------
124
+
125
+
126
+
127
+At 10:00 AM 1/6/2003, chang hui wrote:
128
+>Jiri,
129
+>
130
+>Thanks for your explanation, and let me know the architecture drawback of the B2BUA.
131
+
132
+
133
+I've already done so in my previous email. If something was not clear
134
+enough, let me know.
135
+
136
+
137
+>Since we have no way to choose other means to implement pre-paid, we have to go along with B2BUA in a short term.
138
+>Could you give me any advise how to implement B2BUA based on SER and estimate the work we should do?
139
+>Could you give me a performance estimate?
140
+
141
+
142
+A hand-waving guesstimate is that performance degrades by 50%.
143
+(We currently achieve up to 3-5 kCPS on a PC -- fair capacity
144
+ to slice off from.).
145
+
146
+
147
+a B2BUA does a lot of things:
148
+- first, it keeps dialog state -- it remembers cseq, callid,
149
+  route set, etc. for the whole time of a call (i.e., it eats 
150
+  memory). All this information is needed when you later wish 
151
+  to initiate correct BYEs.
152
+- it translates UAC to UAS transactions and vice versa
153
+- you probably want to save the dialog state on some persistence
154
+  storage (mysql) -- signaling would not work on reboot otherwise
155
+
156
+
157
+That would take quite some development work. I think the amount
158
+of work can be somewhat lowered if normal (record-routed) proxy 
159
+processing is used, as opposed to a full B2BUA which terminates
160
+all UAS transactions and translates them to UAC transactions.
161
+You then still need to do the following:
162
+- keeping a dialog table (keyed by callid and local/remote tags)
163
+- updating the dialog table (new items on INVITE completion, remove 
164
+  dialogs on BYE, update dialog state, such as CSeq, on any other 
165
+  request).
166
+- starting a timer on beginning of a dialog that -- when expired,
167
+  subject to balance and charging plans --  sends BYEs to all call  
168
+  parties using dialog context.
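The three tasks listed above (keeping, updating and timing out a dialog table) could look roughly like the sketch below. It is a standalone illustration, not ser module code: the class and method names are hypothetical, a single CSeq is tracked instead of one per direction, and sending the BYEs is left to the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Dialog:
    call_id: str
    local_tag: str
    remote_tag: str
    cseq: int
    route_set: list = field(default_factory=list)
    deadline: float = 0.0  # absolute time at which the credit runs out


class DialogTable:
    def __init__(self):
        self._dialogs = {}  # (call_id, local_tag, remote_tag) -> Dialog

    def on_invite_completed(self, dlg):
        """An INVITE completed successfully: start tracking the dialog."""
        self._dialogs[(dlg.call_id, dlg.local_tag, dlg.remote_tag)] = dlg

    def on_request(self, call_id, local_tag, remote_tag, method, cseq):
        """Any later in-dialog request: drop on BYE, else refresh CSeq."""
        key = (call_id, local_tag, remote_tag)
        dlg = self._dialogs.get(key)
        if dlg is None:
            return
        if method == "BYE":
            del self._dialogs[key]
        else:
            dlg.cseq = max(dlg.cseq, cseq)

    def pop_expired(self, now):
        """Remove and return dialogs whose deadline has passed."""
        expired = [d for d in self._dialogs.values() if d.deadline <= now]
        for d in expired:
            del self._dialogs[(d.call_id, d.local_tag, d.remote_tag)]
        return expired

    def __len__(self):
        return len(self._dialogs)


table = DialogTable()
table.on_invite_completed(Dialog("abc@host", "t1", "t2", cseq=1, deadline=300.0))
table.on_request("abc@host", "t1", "t2", "INFO", 2)
for dlg in table.pop_expired(now=301.0):
    print("send BYEs for", dlg.call_id, "with CSeq", dlg.cseq + 1)
```

In a real module the table would live in shared memory and be fed from transaction-completion callbacks; persistence, as noted earlier, is a separate concern.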
169
+
170
+
171
+That could be implemented as a new ser module, which registers
172
+TM-module callbacks to be updated on transactions completions.
173
+One could also move the dialog maintenance out of ser to some
174
+shell scripts to make programming easier. That would however
175
+very likely degrade performance noticeably.
176
+
177
+
178
+Also note, that these scenarios are based only on signaling -- there
179
+are no PSTN-prepaid-style announcements such as "you can call 5 minutes"
180
+and "your call will be cut off in 20 seconds". It is doable too,
181
+but it is probably more meaningful to start with the signaling
182
+part.
183
+
184
+
185
+-Jiri
186
+
187
+
188
+
189
+>Best Regards and Thanks.
190
+>
191
+>
192
+>Chang Hui
193
+>-----Original Message-----
194
+>From: Jiri Kuthan [mailto:jiri@iptel.org]
195
+>Sent: Saturday, January 04, 2003 8:29 PM
196
+>To: chang hui; serusers@iptel.org
197
+>Subject: RE: [Serusers] About B2BUA
198
+>
199
+>Hello,
200
+>
201
+>I see -- prepaid scenarios are indeed difficult without a B2BUA.
202
+>There has been a proposal few times to use session timer (a proxy
203
+>looks at ballance and attaches a hint to SIP requests indicating
204
+>when a call should terminate), but the work has not been pursued.
205
+>
206
+>You may find a discussion of B2BUA architectural drawbacks on the
207
+>SIP mailing list, selected postings are at http://www.iptel.org/info/trends/#b2bua.
208
+>imho, the most compelling issue is that of robustness and scalability.
209
+>A b2bua needs to keep track of all current calls. A broken b2bua affects
210
+>signaling for all existing calls.
211
+>
212
+>Basically, a B2BUA is simply two UAs glued together. It accepts
213
+>transactions as a server, and initiates client transactions
214
+>based on them. It keeps dialog state (callid, cseqs, etc.) and
215
+>may initiate in-dialog transactions on its own (like the BYE
216
+>transaction in which you are interested).
217
+>
218
+>It is doable to implement a B2BUA on top of ser, but it would
219
+>cost quite some development effort. Particularly, it would take
220
+>dialog maintenance (better with persistency so that signaling
221
+>does not get broken on reboot). We  can provide guidanance to
222
+>volunteers willing to go through this exercise.
223
+>
224
+>-Jiri
225
+>
226
+>At 02:28 AM 1/4/2003, chang hui wrote:
227
+>>Jiri,
228
+>>
229
+>>Thanks for your prompt response.
230
+>>We want to implement a pre-paid system in which once subscriber's balance is depleted, the dialog could be torn in time. However other Proxy or other elements could not take part in the call, they could not send a BYE to caller directly. It's the why we consider B2BUA.
231
+>>We project to build a B2BUA to support voice/video/IM at first stage, and support other SIP based services as they emerged.
232
+>>However, I just noticed the definition of B2BUA in 2543-bis04 in several sentences,  there has no other analysis on performance, reliability, limitations and how to implement it. So, I hope to get help from the society.
233
+>>Thanks for your help again.
234
+>>
235
+>>Koalas
236
+>>
237
+>>-----Original Message-----
238
+>>From: Jiri Kuthan [mailto:jiri@iptel.org]
239
+>>Sent: Friday, January 03, 2003 11:06 PM
240
+>>To: chang hui; serusers@iptel.org
241
+>>Subject: Re: [Serusers] About B2BUA
242
+>>
243
+>>Hello,
244
+>>
245
+>>ser is not a B2BUA -- it can act as proxy, redirect, transactional UAS
246
+>>or registrar. These modes make a vast majority of network scenarios
247
+>>happy without a need to use a B2BUA. Which is good, because B2BUAs
248
+>>inherently have certain scalability, reliability and security limitations.
249
+>>
250
+>>Is there a particular reason why you would like to use a B2BUA?
251
+>>
252
+>>-Jiri
253
+>>
254
+>>At 08:00 AM 1/3/2003, chang hui wrote:
255
+>>>Hi All,
256
+>>>
257
+>>>I am newbie of this field, thanks everyone help me.
258
+>>>I am interesting in B2BUA, however, except some brief defination in 3261, I could not find any further defination or how to implement about B2BUA, I noticed that SER could be implemented as a B2BUA, where can I find some implementation? or where can I get any description?
259
+>>>
0 260
new file mode 100644
@@ -0,0 +1,395 @@
1
+$Id$
2
+
3
+
4
+Draft Distributed Media Server Architecture
5
+===========================================
6
+
7
+Jiri Kuthan, iptel.org, January 2003
8
+
9
+
10
+Abstract
11
+--------
12
+
13
+We describe design considerations made when expanding voicemail 
14
+application to a more general media server. The objective of
15
+media server is to bind voice to SIP applications with optional
16
+support of other tools (SIP SUB/NOT, mysql, TTS, etc.) It has
17
+to be configurable in such a way it can act in different component
18
+roles: click-to-dial server, voicemail server, conferencing server, 
19
+text-to-speech announcement server, etc.
20
+
21
+TOC
22
+---
23
+
24
+Section 1, Scenarios and Component Models, explains background 
25
+assumptions on how services can be composed using Rosenberg-advocated 
26
+model. This section is essential to understanding how a media
27
+server can be plugged-in in a SIP network consisting of multiple
28
+components, each delivering a part of a complex service. The section 
29
+also suggests a decentralized architectural improvement for connecting 
30
+SIP components without a need for a B2BUA, a technology we consider 
31
+suboptimal.  (This network architecture puts only very few additional
32
+requirements on the media server.)
33
+
34
+
35
+Section 2, Media Server Requirements, explains basic requirements
36
+a media server needs to fulfill to do a good job in the component
37
+architecture. Design ideas for server's key part, a programming
38
+script, are explained in section 3.
39
+
40
+Related work, references and example scripts are attached in
41
+appendices.
42
+
43
+
44
+
45
+1) Targeted Scenarios and Component Model
46
+--------------------------------------
47
+Many application scenarios can provide a pleasant experience to users 
48
+when users are played explanatory messages or users' voice feedback 
49
+can affect service logic. That is what media servers are basically
50
+good for. The whole service logic may be complex and composed of multiple 
51
+stages (initial announcement, PIN verification, text-to-speech) which
+together form a longer conversation. The individual stages may be
+implemented as parts of a single media server or distributed across
54
+specialized (or specially configured instances of the same) media servers.
55
+
56
+Examples of such multi-stage conversations are voicemail, conferencing, 
57
+click-to-dial, and prepaid calls. Some of these scenarios have been 
58
+addressed in J. Rosenberg's dissertation and an almost identical Internet
59
+Draft co-authored by P. Mataga [components]. (See also [featureinteraction]). 
60
+They proposed a component model, in which a B2BUA faces a caller on its 
61
+UAS part, and connects to different SIP devices on its UAC part. This 
62
+B2BUA, a so-called call controller, acts as glue: it connects all possible
+SIP-enabled application components together. It maintains a "service
+state machine" which defines how to link components with each other
65
+as a session proceeds. It uses HTTP as a complementary protocol for 
66
+the components to report on their progress to the controller. For example, 
67
+the controller may first connect on caller's behalf to a "pre-paid prompt 
68
+component", which queries user's PIN and reports it to the controller. 
69
+On success, the controller can then hand-off the call to a PSTN gateway.
70
+
71
+This architecture is extremely good in that it introduces distributed
+components. Decomposition, an important design principle, is performed
73
+in a fair, peer-2-peer manner that allows linking SIP devices in
74
+a very flexible way.
75
+
76
+The biggest shortcoming of this architecture is imho its central piece, 
77
+the controller. It is simply too central. A B2BUA design  inherently causes 
78
+many concerns: security, scalability, and reliability ones. B2BUA solutions 
79
+proposed in 3pcc draft [3pcc] by Rosenberg have several signaling drawbacks 
80
+too: tricky media matching (flow III), backwards compatibility
81
+(flow IV), etc. There is also the economical aspect: a B2BUA
82
+costs money or development effort.
83
+
84
+We believe it is beneficial to avoid such B2BUA constructs. The mechanism
85
+we are advocating is distributing the service state machine across
86
+participating components. With such a scheme, it is the current component
87
+that decides what to do next, i.e., when to proceed to which next component.
88
+A caller contacts an initial component (say a PIN prompting media server) 
89
+identified by an URI, which is in fact an identifier of the initial service 
90
+state. An initial conversation is carried out then ("give me your PIN: 
91
+1-2-3-4"). The component collects the PIN and when finished, it passes 
92
+over to the next component. There is a choice to verify the PIN in the 
93
+first component and pass over the final authorization status ("no" or
+"yes" or "yes but no longer than a 5 minute call") or to pass the PIN
95
+and leave its authorization to the next component. 
96
+
97
+This construct is more distributed: the controller permanently involved
98
+in caller's conversation is gone. It is always the current component
99
+that decides what to do next. There are always only two parties in
+a relationship: the caller and the current component. The "middlebox"
+B2BUA is gone.
102
+
103
+Another benefit of this more e2e-oriented approach is a better way
104
+of dealing with caller's preferences. Caller preferences are about the 
105
+ability to gain user's consent with transitions in conversation -- e.g., 
106
+is it acceptable for a caller to be transferred to a CIA server? With
107
+the REFER approach, all transition decisions are actually made
108
+by client, which is good. Other solutions, in which a downstream
109
+entity decides on caller's behalf are imho too limiting. They
110
+require the caller to upload his preferences in a standardized
111
+format to the upstream client. As the preference space is almost
112
+infinitely big, the way of standardizing caller's preferences does
113
+not seem too beneficial to us. There may be always some preferences,
114
+which the preference format does not capture. Make it simple and
115
+allow the caller to decide on his own behalf. He is responsible, knows
116
+what he wants and possibly does not trust the upstream client
117
+to interpret his preferences as desired.
118
+
119
+Mechanically, the transition to the next component can be easily
120
+achieved using REFER [refer]. When the current component completes, it hints
121
+caller to proceed to the next one using REFER. The URI in Refer-To 
122
+represents the next component (a PSTN proxy) as well as some
123
+service attributes ("pin ok, 5 minutes permitted") with which
124
+the component can begin. When like in this case the URI carries
125
+security-sensitive information, the information may be encrypted
126
+or a message integrity check may be attached. Note that this mechanism
127
+eliminates a need for the "HTTP reporting hack" in jdr's architecture. 
128
+Session status is reported in SIP URIs. Cooperating components just 
129
+need to agree on a scheme for URI usage. That should be easy for SIP 
130
+servers as URI processing is a primary SIP ability.
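As an illustration of carrying service attributes in the Refer-To URI together with a message integrity check, consider the sketch below. The parameter names (pin, maxdur, mac), the host name and the shared-key arrangement are assumptions made for the example; components are free to agree on any other scheme. Attribute values are assumed to contain neither ";" nor "=".

```python
import hashlib
import hmac


def build_refer_to(base_uri, attrs, key):
    """Append service attributes as URI parameters and sign the result."""
    params = ";".join(f"{name}={value}" for name, value in sorted(attrs.items()))
    unsigned = f"{base_uri};{params}"
    mac = hmac.new(key, unsigned.encode(), hashlib.sha256).hexdigest()
    return f"{unsigned};mac={mac}"


def verify_refer_to(uri, key):
    """Return the attribute dict if the integrity check passes, else None."""
    unsigned, sep, mac = uri.rpartition(";mac=")
    if not sep:
        return None
    expected = hmac.new(key, unsigned.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return None
    _base, *params = unsigned.split(";")
    return dict(p.split("=", 1) for p in params)


key = b"secret shared by the cooperating components"
uri = build_refer_to("sip:gw@pstn.example.com", {"pin": "ok", "maxdur": "300"}, key)
print(uri)
print(verify_refer_to(uri, key))
print(verify_refer_to(uri.replace("maxdur=300", "maxdur=3000"), key))
```

The check only protects integrity: a caller can still read the attributes, so anything confidential would additionally need encryption, as the text notes.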
131
+
132
+A simple application of this more distributed approach is REFER-based 
133
+"click-to-dial" service. In this scenario, a media component gets somehow 
134
+instructed to initiate a call. It first calls the first party, optionally
+plays a short announcement ("you will be transferred now") and then transfers
+this initial call to the other call party. It then completely disappears
+from the subsequent conversation.
138
+
139
+The "pre-paid verification component" referred to in this section is another 
140
+example use of this model. It establishes a call with caller, looks at 
141
+desired destination, processes PIN in media stream, and makes a decision 
142
+to hand over to a gateway. It then disappears from the conversation.
143
+
144
+Note that the application call-control framework [ccframework] by Mahy et al. 
145
+explicitly mentions a more peer-2-peer oriented approach based on REFER as
146
+a good alternative to a centralized B2BUA approach. 
147
+
148
+
149
+
150
+2) Media Server Requirements: Flexibility and Extensibility
151
+-----------------------------------------------------------
152
+
153
+In all such application scenarios, a media component has a central
154
+role. It plays announcements, records messages, and interacts with
155
+caller via signaling too: it can terminate or transfer a call. 
156
+
157
+There are two major requirements on its design to make it useful
158
+for applications as mentioned above: it needs to be flexible 
159
+and extensible.
160
+
161
+Flexibility is desired to be able to configure the media server
162
+for its particular purpose without having to rewrite it each time. 
163
+It should be possible to configure whether on receipt of a 
164
+specific URI, the server plays or records a message. It should 
165
+be possible to dictate maximum call length and define what happens 
166
+when the length timer really strikes:  should the call be transferred 
167
+to another component (and if so, to which) or simply bye-d? Etc.
168
+
169
+We suggest that, as in SER, this flexibility be achieved
+by a scripting language (see below).
171
+
172
+The other requirement is extensibility. The media server scripts
173
+should be able to leverage other available tools. A particular
174
+example is coupling of script logic with MySql databases --
175
+a feature that made PHP an ultimate success. In the context of the
176
+previous prepaid examples, it can be used to verify user's PIN and
177
+maximum possible call length. Text-to-speech software such as
178
+festival [festival], AT&T's Natural Voices [nv] or CMU
179
+speech software  [cmuspeech] (!!!) including Sphinx, festvox,
180
+openvxi are examples of other pieces of work worth integrating
181
+with.
182
+
183
+3) On Scripting Language
184
+---------------------
185
+
186
+scope)
187
+
188
+The scripting language should be able to define call processing:
189
+establish, transfer, terminate a call, provide media processing
190
+and use external libraries (php, tts, etc.) in an extensible manner.
191
+It should stay open to integration with Internet services and
192
+allow things like HTTP queries or SIP instant messaging.
193
+
194
+call/transaction abstractions)
195
+
196
+The language should hide protocol details well to make programming
197
+easy. While access to lower-level features should not be precluded, 
198
+abstraction and simplicity are the key for application programming. 
199
+
200
+The primary living space of the media server programming language
201
+should be calls. Scripts should be able to deal with calls:
202
+initiate, terminate and transfer them. ([ccframework] coins
203
+"replace", "join", "fork").
204
+
205
+An important lower-level escape way should be the ability to initiate
206
+in-call (in-dialogue) transaction. That is what allows the server
207
+to go beyond simple VoIP/media services. An example of use of
208
+such an ability would be sending notifications on some events
209
+(like when a new party joins a multi-party call conference)
210
+or subscribing to some call-related events:
211
+  ret=$call.new_transaction("INFO",
+     "headerfield: value\nhf2: ".$some_var."\n", "two USD");
213
+
214
+
215
+events)
216
+
217
+All of us have agreed that event-oriented approach is a good
218
+abstraction. The event system should be very universal and
219
+accept events from a variety of sources in a unified manner.
220
+The sources include but are not limited to SIP messages, timers
221
+(so that for example voicemail app can set the longest possible 
222
+recording), external events from local apps (perhaps via FIFO), 
223
+media events (DTMF), SIP notifications.
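A unified event system of this kind can be very small. The sketch below shows one possible shape, in which every source (SIP, timers, FIFO, DTMF) delivers events through the same emit() call; the class, the event names and the dictionary used to stand in for a call are hypothetical, not an agreed design.

```python
from collections import defaultdict


class EventBus:
    """Dispatches named events from any source to registered handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_name):
        """Decorator that registers a handler for event_name."""
        def register(handler):
            self._handlers[event_name].append(handler)
            return handler
        return register

    def emit(self, event_name, **payload):
        """Call every handler for event_name; return their results."""
        return [h(**payload) for h in self._handlers.get(event_name, [])]


bus = EventBus()


@bus.on("dtmf")
def collect_digit(call, digit):
    call.setdefault("pin", []).append(digit)
    return len(call["pin"])


@bus.on("timer")
def too_long(call):
    call["state"] = "terminated"
    return call["state"]


call = {"state": "recording"}
bus.emit("dtmf", call=call, digit="1")   # media event
bus.emit("timer", call=call)             # timer event, same delivery path
print(call)
```

Because handlers are looked up by name only, adding a new event source means adding a new emit() call site, not a new language construct.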
224
+
225
+There was a proposal too, to introduce notion of SUB/NOT and presence
226
+to the language. Examples of use are "initiate a conference call when all 
227
+invited  users are on-line", "repeat a call when called party is
228
+no longer busy" [dialogpackage], "query participant list in a multi-party
229
+conversation", etc. We haven't discussed yet whether, and if so
230
+how such scenarios should be reflected in the language.
231
+
232
+requirement summary)
233
+
234
+So far, we have identified the following requirements:
235
+    - programming effectiveness (easy and intuitive to use)
+    - parallelism (multiple scripts processed at the same time,
+      multiple calls referred to from a single script)
+    - variables (referring to multiple calls)
239
+    - event processing
240
+    - ability to change script without rebooting the server
241
+    - extensibility (i.e., the ability of the environment to link 
242
+      external binary libs and refer to them from scripts)
243
+
244
+Some design options mentioned so far (nice but not required)
245
+    - have some casting from input to variables (e.g, $request.header.callid)
246
+    - use OO -- there are many people for whom OO is easier
247
+    - exceptions to group error processing
248
+
249
+main-loop language)
250
+
251
+We have not made any determination yet on whether to reuse an
+existing scripting language (and bind SIP code or any other code
+to it from C/C++ libraries) or design our own from scratch.
254
+
255
+Proponents of language reuse (Python may be a reasonable option)
256
+are primarily concerned about too much unnecessary development 
257
+and debugging effort for both the basic language and especially 
258
+for its extensions.
259
+
260
+Opponents were concerned about difficulties with integration of
261
+the scripting languages with code libraries. Other cons are
262
+bigger image size and dependency on third-party software.
263
+However, risks of bugs and inability to tweak things are rather
264
+low with well-established open-source software like python.
265
+Possibly, syntax of an own language might better capture
266
+semantics of the media server.
267
+
268
+As said, no determination has been made yet. The author of this
+memo is a little bit uncomfortable with the current amount of
+development work put on the ser team and hopes that use of an
+off-the-shelf language would save work cycles. (Hopefully,
272
+this hope will not be broken by tremendous effort spent
273
+in integration with supporting libraries.)
274
+
275
+
276
+see more )
277
+
278
+Appendixes include pseudo-examples of scripts written in such
279
+languages. (An XML-based language was discussed too, but its
280
+proponent gave up on it since it was really big and difficult
281
+to read.)
282
+
283
+
284
+A) Related Work
285
+------------
286
+There has been a whole bunch of related work. Traditional IVRs
287
+were programmable decades ago. Related technologies include 
288
+[kpml], [mscml]*, [vxml], Cisco's use of TCL [ciscotcl].
+[bayonne] has some too. snom uses an XML-based language, and
+there is a voicemail system based on JavaScript and the NIST SIP stack.
291
+
292
+* one of the differences between kpml and mscml is kpml uses HTTP
293
+  for reporting (similarly to [components]), MSCML uses SIP
294
+
295
+
296
+B) References
297
+----------
298
+[3pcc] http://www.iptel.org/ietf/callprocessing/3pcc/#draft-ietf-sipping-3pcc
299
+[bayonne] http://www.gnu.org/software/bayonne
300
+[ciscotcl] http://www.cisco.com/univercd/cc/td/doc/product/access/acs_serv/vapp_dev/tclivrv2/chapter1.htm
301
+[cmuspeech] http://www.speech.cs.cmu.edu/speech/
302
+[components] http://www.iptel.org/ietf/callprocessing/apps/#draft-rosenberg-sip-app-components-01
303
+[ccframework] http://www.iptel.org/ietf/callprocessing/#draft-ietf-sipping-cc-framework
304
+[dialogpackage] http://www.iptel.org/ietf/callprocessing/#draft-ietf-sipping-dialog-package
305
+[featureinteraction] http://www.iptel.org/ietf/callprocessing/apps/#draft-rosenberg-sipping-app-interaction-framework
306
+[festival] http://www.cstr.ed.ac.uk/projects/festival
307
+[mscml] http://www.iptel.org/ietf/callprocessing/apps/#draft-vandyke-mscml
308
+[kpml] http://www.iptel.org/ietf/callprocessing/apps/#draft-burger-sipping-kpml
309
+[nv] http://www.naturalvoices.att.com/
310
+[refer] http://www.iptel.org/ietf/callprocessing/refer/#draft-ietf-sip-refer
311
+        (recently approved by IESG for publication as RFC)
312
+[vxml] http://www.iptel.org/ietf/callprocessing/apps/#draft-rosenberg-sip-vxml-00
313
+
314
+
315
+C) Appendix: pseudo-scripting language
316
+------------------------------------
317
+
318
+/* voicemail */
319
+event{new_call}(call $c) {
320
+   $c.play("welcome"); /* play blocking */
321
+   new_timer(too_long, 200 sec, $c, terminate_call);
322
+   $c.record("/var/spool/voicemail/"+$c.callee); /* record non blocking */
323
+}
324
+event{eo_call}(call $c) {
325
+   // do nothing; by default, all what has been started is closed 
326
+}
327
+event{too_long}(call $c) {
328
+   $c.terminate();
329
+}
330
+
331
+/* 3pcc a la call transfer */
332
+event{click_to_dial} (uri $to, uri $from) {
333
+    $c=new_call("sip:webcaller@foo.bar" /*our daemon invites caller */, $from /* caller */);
334
+    $c.play("you will be transferred now");
335
+    $c.refer($to); /* refer creates an event ... NOTIFY */
336
+}
337
+event{notify}(call $c) {
338
+    /* great, caller has established conversation with the other party --
339
+       we can hang-up now */
340
+    $c.terminate();
341
+}
342
+
343
+
344
+
345
+D) Appendix: use of python
346
+-----------------------
347
+
348
+
349
+import os
+
+class App(SIPApplication):
+    def doInvite(self, req):
+        trans = req.transaction()
+        dlg = req.dialog()
+        app = dlg.application()
+
+        if req.uri().domain() == "voicemail.org":
+            try:
+                media = req.sdp.negotiate()
+                trans.reply(200)
+            except Exception:
+                trans.reply(500)
+                return
+
+            # play the user's own announcement, or a default one
+            file = "/home/" + req.uri().username() + "/ann.au"
+            if not os.path.exists(file):
+                file = "/ann.au"
+            media < file   # pseudo-operator: play file into the stream
+
+            # record the message, at most 200 seconds
+            file = "/home/" + req.uri().username() + "/msg.au"
+            media.maxlength(200) > file   # pseudo-operator: record to file
+
+    def doBye(self, req):
+        trans = req.transaction()
+        trans.reply(200)
+        req.dialog().media().stop()
+
+    def doHTTP(self, req):
+        # click-to-dial: call the first party, then refer it to the second
+        try:
+            dlg = placeCall(req.uri1)
+            dlg.media() < tts("just a moment")
+            dlg.refer(req.referto)
+            dlg.application().click = True
+        except Exception:
+            log("error")
+
+    def doNotify(self, req):
+        dlg = req.dialog()
+        if dlg.application().click:
+            req.transaction().reply(200)
+            dlg.bye()
+        else:
+            req.transaction().reply(...)
+
+    def doTimeout(self, app):
+        dlg = app.dialog("caller")
+        dlg.bye()
0 396
new file mode 100644
... ...
@@ -0,0 +1,274 @@
1
+$Id$
2
+
3
+
4
+Draft Voicemail Architecture
5
+============================
6
+
7
+Jiri Kuthan, iptel.org, January 2003
8
+
9
+Abstract
10
+--------
11
+
12
+We describe design decisions made when adding media
13
+support to iptel.org's SIP server suite. We discuss
14
+how to introduce a voicemail component most effectively,
15
+i.e., without involving the voicemail programmer too deeply
16
+in SER. We also mention some design choices which
17
+can be in general made to couple external applications
18
+with SER.
19
+
20
+TOC
21
+---
22
+
23
+We first discuss interfacing methods used between SIP
24
+server/stack and applications in section 1, interfacing.
25
+We explain why we chose FIFO for the purpose.
26
+
27
+Section 2, IPC, gives details on the use of FIFO. Call flow
+examples and the use of FIFO are detailed in Section 3.
29
+
30
+Possible extensions of the FIFO interface are mentioned
31
+in section 4.
32
+
33
+We show how the IPC/FIFO mechanisms compare to CGI-BIN
34
+which is architecturally close, in Section 5.
35
+
36
+1) Interfacing
37
+--------------
38
+
39
+A primary design objective is to hide SIP/SER internals from
+application builders. The SER code is not easy: it includes
+a lot of shmem access along with its synchronization, and quite
+dynamic memory use and management. Data structures are rich
+and dynamic. That makes the life of an application programmer
+quite difficult and is likely to result in a higher bug rate.
+Thus, it is desirable to decouple applications from the stack.
46
+
47
+We have considered two approaches: API-based and FIFO-based.
+The API-based approach relies on a clean encapsulation of the parser,
+memory management and other frequently used code in a library.
+The library should hide as many details as possible from
+the application developer.
52
+
53
+While librarization of SER is a very desirable objective,
54
+it is a time-expensive task and we do not want it to become
55
+a road-block for application creation. That's the primary
56
+reason why we are going with FIFO now.
57
+
58
+There were technical arguments related to FIFO use in this
59
+context too. Some (myself) were arguing that FIFO provides 
60
+the cleanest separation of applications from ser. It is 
61
+language-independent, allowing use of effective scripting  
62
+languages and whatever an app programmer is familiar with. 
63
+It is in no way tied to ser's architecture and the burden of
64
+its parallel processing, synchronization, data structures 
65
+and memory management.
66
+
67
+Counter-arguments (by almost everyone else) against FIFO included
68
+concerns that SER will become too bloated by exporting too
69
+much of its functionality through FIFO. It is certainly 
70
+true that a technology may become a victim of its own
71
+success if it grows too big. SIP itself is unfortunately 
72
+becoming an example of such technology. 
73
+
74
+A demarcation line we agreed to draw was dialog maintenance,
75
+which shall stay out of SER, whereas transaction-related
76
+stuff will stay in SER.
77
+
78
+
79
+2) IPC
80
+------
81
+
82
+1) The voicemail server will not be started via fork/exec,
+   as that is too expensive. Instead, it will be
+   multi-threaded and await INVITEs via its FIFO server.
+   SER will then dump incoming INVITE requests to
+   voicemail's FIFO server (non-blocking). A drawback
+   is that the FIFO server will not be able to inherit
+   pre-parsed header fields in environment variables.
89
+
90
+2) subsequent requests, such as BYE, will take the
91
+   same FIFO path
92
+
93
+3) the external application will communicate with SER
94
+   using FIFO. For the purpose of replying to original
95
+   INVITEs, there will be a t_fifo_reply command.
96
+   The command will identify a transaction to be 
97
+   replied using the pair hash:label. It will be further 
98
+   parametrized by first reply line, optional header fields and 
99
+   optional body. (The pair hash:label will have to be
100
+   communicated via the method described in 1.)
101
+
102
+4) to-tags will be generated in the external app.
103
+   That's a change from previous suggestions. It's
104
+   a consequence of moving process/thread control
105
+   from SER to the app. In general, to-tags identify
106
+   a call and thereby the process/thread associated with
107
+   it. So the generation of to-tags should be owned by
108
+   the piece responsible for spawning new processes/threads
109
+   -- this is the place which will have to dispatch
110
+   subsequent requests to previously spawned processes.
111
+
112
+5) BYEs from voicemail (on timeout) will be done using
+   fifo t_uac. fifo t_uac will have to be changed to
+   allow parametrization of Call-ID/CSeq (they are only
+   ephemeral now). Call-IDs and CSeqs known from previous
+   requests will be passed to SER via FIFO as t_uac
+   parameters.
118
+
119
+6) As for CANCEL: the voicemail app does not care about it.
+   It is automated and responds immediately, so CANCEL is
+   not relevant. It is the responsibility of the transaction
+   machine to take care of CANCELs. If a CANCEL arrives while the
+   transaction is still alive, it will not affect
+   the call state; otherwise it will be replied with 481.
+   See section 9.2 of RFC 3261 for details.
126
+
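Item 4 above puts to-tag generation and dialog dispatch into the external application. A minimal sketch of such a dialog table follows, assuming an in-memory dictionary keyed by to-tag. The class name DialogTable and its fields are hypothetical and only illustrate the idea; they are not part of SER or of any existing voicemail code.

```python
import secrets


class DialogTable:
    """Hypothetical dialog store owned by the voicemail application."""

    def __init__(self):
        self._by_to_tag = {}

    def new_dialog(self, call_id, from_tag):
        """Create dialog state for a new INVITE and return its fresh to-tag."""
        to_tag = secrets.token_hex(8)
        while to_tag in self._by_to_tag:      # retry on the unlikely collision
            to_tag = secrets.token_hex(8)
        self._by_to_tag[to_tag] = {
            "call_id": call_id,
            "from_tag": from_tag,
            "remote_cseq": None,              # filled in as requests arrive
        }
        return to_tag

    def lookup(self, to_tag):
        """Return the dialog for a subsequent request, or None if unknown."""
        return self._by_to_tag.get(to_tag)

    def remove(self, to_tag):
        """Forget a dialog once the call has ended."""
        self._by_to_tag.pop(to_tag, None)
```

Because the application creates the to-tag, it is also the only place that can map a later BYE back to the thread or state that serves the call.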
127
+3) Call Flows and FIFO Use
128
+---------------------------
129
+
130
+a) call setup
131
+
132
+---> ... SIP
133
+===> ... FIFO
134
+
135
+UAC               SER            VM
136
+         INVITE
137
+         ----->
138
+         100                          ; 100 generated automatically
139
+         <---                         ; by t_fifo
140
+                     t_fifo(INVITE)   ; if request acceptable, VM
141
+                     =========>       ; stores dialog state indexed by
142
+                                      ; newly created, unique to-tag and replies
143
+                     t_reply(200)
144
+                     <========
145
+                     200 ok           ; (FIFO/200 means transaction found and
146
+                     ========>        ; reply accepted for delivery)
147
+         200
148
+         <---
149
+
150
+b) voicemail terminates call on timer
151
+
152
+UAC              SER             VM
153
+                     t_uac(BYE)      ; VM generates BYE using dialog context
154
+                     <=========      ; created and stored on receipt of INVITE
155
+                                     ; (see section 8 in RFC 3261, particularly
156
+                                     ; dealing with RR is tricky)
157
+        BYE
158
+        <---
159
+        200           200
160
+        ---->        ==========>     ; uac completed -- FIFO returns
161
+                     
162
+
163
+c) caller terminates
164
+
165
+UAC              SER              VM
166
+
167
+      BYE            t_fifo(BYE)
168
+     ------>         ==========>     ; VM attempts to look-up the call; on look-up 
169
+                                     ; failure or if CSeq low, it initiates t_reply 
170
+                     t_reply(200)    ; with a negative code; otherwise, it completes
171
+                     <==========     ; recording and confirms the BYE with a 200
172
+
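The decision the voicemail application takes in flow c) can be sketched as follows. This is only an illustration: the function name handle_bye and the dictionary layout are hypothetical. The choice of 481 for an unknown dialog and 500 for an out-of-order CSeq follows our reading of RFC 3261; the memo itself only says "a negative code".

```python
def handle_bye(dialogs, to_tag, cseq):
    """Return (code, phrase) to be sent back via t_reply for a BYE.

    dialogs maps to-tags to per-call state (hypothetical layout); each
    entry keeps the highest CSeq seen from the caller in "remote_cseq".
    """
    dialog = dialogs.get(to_tag)
    if dialog is None:
        # look-up failure: no such call
        return 481, "Call/Transaction Does Not Exist"

    last = dialog.get("remote_cseq")
    if last is not None and cseq < last:
        # CSeq lower than one already seen: request is out of order
        return 500, "Server Internal Error"

    # Regular case: the call ends here. A real application would
    # complete the recording before dropping the dialog state.
    del dialogs[to_tag]
    return 200, "OK"
```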
173
+--
174
+use of FIFO:
175
+
176
+  t_fifo is a tm action -- it creates a new transaction (t_newtran),
177
+  sends a provisional 100 back (interaction with media component may
178
+  take long) and dumps request to a file (presumably media server's
179
+  FIFO):
180
+     t_fifo("/tmp/media_fifo", "some parameters");
181
+  The following items are dumped:
182
+  - t_fifo (media component may receive other requests)
183
+  - parameters
184
+  - to_tag (optimization for a quick dialog look-up)
185
+  - transaction identification: hash and label (used to refer
186
+    to transaction when replying)
187
+  - received requests
188
+  Finally, t_fifo sets a timer (otherwise, the application could fail
+  to reply and the transaction would never be released).
190
+
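The non-blocking dump to the media server's FIFO can be illustrated with plain POSIX calls. The sketch below is in Python for brevity (SER itself is written in C) and is not SER's code; the function name dump_to_fifo is hypothetical. It relies on the POSIX rule that opening a FIFO write-only with O_NONBLOCK fails with ENXIO when no reader has it open.

```python
import errno
import os


def dump_to_fifo(path, payload):
    """Try to write payload (bytes) to a named pipe without blocking.

    Returns True if the whole payload was written, False if no reader
    has the FIFO open or the pipe could not take the data right now.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as e:
        if e.errno == errno.ENXIO:      # nobody is reading the FIFO
            return False
        raise
    try:
        return os.write(fd, payload) == len(payload)
    except BlockingIOError:             # pipe buffer is full
        return False
    finally:
        os.close(fd)
```

A caller that gets False back still holds the transaction, which is one reason the timer mentioned above is needed.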
191
+  t_reply is a FIFO command, which is part of tm too -- it allows external 
192
+  apps to reply to a pending transaction. It is parametrized as follows:
193
+  - to_tag (to be used if there is no tag in original request; important
194
+            for looking up dialog for future requests)
195
+  - transaction identifier: hash, label
196
+  - code
197
+  - phrase
198
+  - optional header fields and body
199
+  
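The t_reply parameters listed above can be grouped in a small structure on the application side. The sketch below is hypothetical throughout: the class TReply, its field names and the one-field-per-line layout produced by serialize() are assumptions made for illustration and do not describe SER's actual FIFO syntax.

```python
from dataclasses import dataclass, field


@dataclass
class TReply:
    """Parameters of a t_reply request (names are illustrative)."""
    to_tag: str
    trans_hash: int
    trans_label: int
    code: int
    phrase: str
    headers: list = field(default_factory=list)   # optional header field lines
    body: str = ""                                # optional body

    def serialize(self):
        """Render one field per line (assumed layout, not SER's real one)."""
        if not 100 <= self.code <= 699:
            raise ValueError("SIP status codes range from 100 to 699")
        if "\n" in self.phrase or "\r" in self.phrase:
            raise ValueError("reason phrase must be a single line")
        lines = [
            "t_reply",
            self.to_tag,
            f"{self.trans_hash}:{self.trans_label}",
            str(self.code),
            self.phrase,
        ]
        lines.extend(self.headers)
        lines.append("")            # empty line separates header fields from body
        lines.append(self.body)
        return "\n".join(lines)
```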
200
+4) Possible Extensions of the FIFO interface
201
+---------------------------------------------
202
+
203
+All these extensions are meant to ease coupling of external
204
+applications with ser.
205
+
206
+A reasonable alternative for some other applications would be to
+use exec instead of FIFO for the SER->APP path. That would
+have the benefit of getting header fields conveniently
+parsed into env vars in the same way as the exec module does
+it (since release 0.8.11). The application would then not have to parse
+header fields. That, however, only makes sense if the executed apps are
+small -- forking is expensive and all is much worse if the started
+application is big and starts slowly.
214
+
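The exec alternative can be pictured as follows: the server hands pre-parsed header fields to the started program through its environment and the request itself on standard input. This is a sketch only; the function name exec_with_headers is hypothetical, and the SIP_HF_ variable prefix is an assumption about the naming convention, not something this memo specifies.

```python
import os
import subprocess


def exec_with_headers(argv, headers, request):
    """Run an external app with header fields exported as env variables.

    headers maps header field names to values; each one is exported
    under an assumed name such as SIP_HF_CALL_ID. The raw request is
    passed on standard input. Returns the CompletedProcess.
    """
    env = dict(os.environ)
    for name, value in headers.items():
        env["SIP_HF_" + name.upper().replace("-", "_")] = value
    return subprocess.run(
        argv, input=request, env=env,
        capture_output=True, text=True, check=False,
    )
```

Every call forks and executes a new process, which is the cost the paragraph above warns about.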
215
+Some other applications may wish to have other triggering
216
+points than just request receipt. For example, they may wish
217
+to be triggered on transaction completion (e.g., some
218
+accounting applications) or receipt of a reply
219
+(with the possibility to initiate  serial forking,
220
+for example). That is implementable: transaction completion
221
+exec can be implemented in a similar way as the exec
222
+module runs and bound to transaction machine via
223
+a TM callback.  Care needs to be taken in the case
224
+of exec on reply receipt -- it is called from a callback 
225
+installed within reply processing mutex, which poses
226
+some implementation caveats: it has performance
227
+implications and deadlock potentials. In particular,
228
+an exec-ed app bound to reply processing could result
229
+in deadlock if it called FIFO/t_reply.  Also at least
230
+an environment variable describing reply status would
231
+have to be added, so that the script sees more than
232
+the original request.
233
+
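The deadlock risk described above can be shown with a much simplified model: a callback runs while a non-reentrant lock is held and then needs that same lock again. The sketch uses one Python thread and a timeout so that it terminates; the names reply_lock, process_reply and t_reply_from_callback are hypothetical stand-ins, and SER's real locking involves several processes.

```python
import threading

# stands in for the reply-processing mutex
reply_lock = threading.Lock()


def t_reply_from_callback(timeout=0.1):
    """Model of FIFO/t_reply: it needs the reply lock to proceed.

    Returns True if the lock could be taken, False if it is still held
    after the timeout. Without the timeout the caller would wait forever.
    """
    got = reply_lock.acquire(timeout=timeout)
    if got:
        reply_lock.release()
    return got


def process_reply(callback):
    """Model of reply processing: the callback runs under the lock."""
    with reply_lock:
        return callback()
```

Calling process_reply(t_reply_from_callback) returns False, because the lock is already held when the callback asks for it; calling t_reply_from_callback() on its own returns True.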
234
+5) SIP CGI-BIN (RFC3050) comparison
235
+-----------------------------------
236
+SIP CGI-BIN is a nice mechanism for coupling external applications
237
+with SIP servers. It is textual, language-independent, separated
238
+from server processes. From this perspective, it is similar
239
+to SER's app-coupling mechanisms, which include execution of external 
240
+applications as in the exec module, potentially integrated in TM's transaction
241
+management. The reason why SER is somewhat different is of historical 
242
+nature: we have been trying to address mainstream scenarios with
243
+compact solutions. Over time, they grew into bigger beasts
244
+comparable to SIP CGI-BIN today, but still different. We now try to
245
+explain how SER/FIFO/exec compares to SIP CGI-BIN. 
246
+
247
+   Note that there are some applications in which ser's FIFO server
+   can be used whereas CGI-BIN is not applicable. In particular, the
+   FIFO server can be used if an application wants to initiate
+   transactions or dialogs. CGI-BIN is only invoked when the server
+   (on receipt of a message) wants to run applications.
252
+
253
+Also, knowledge of the gaps may be used to implement CGI-BIN for
254
+SER, if ever wanted.
255
+
256
+Similarities:
257
+- both CGI and t_exec (as suggested in #4) can start external apps
258
+  on request receipt; retransmissions and other transaction burden
259
+  is handled by the server
260
+- both CGI apps and t_exec apps can steer proxy server's 
261
+  transaction logic; CGI apps do so by returning instructions 
262
+  on stdout, t_exec apps can do so through FIFO server
263
+
264
+SER Deficiencies:
265
+- enabling applications to remove header fields (CGI permits that)
+  through FIFO is currently not possible -- there is no such FIFO
+  command; it should not be difficult to implement
+- request forwarding is not possible either, for the same reason -- no
+  such FIFO command; easy to change, though
270
+- the application can be re-execed on receipt of a reply from
271
+  a reply_route like in CGI BIN; however, there are no meaningful
272
+  FIFO actions that can be used; use of FIFO/t_reply can result
273
+  in a deadlock as reply_route is called from within a reply_lock,
274
+  which is initiated by t_reply called from FIFO server too